While I attached a lot of specific information about the data and
management part below, I would like to highlight the Maven parts:
# Download of Data
* http://databus.dbpedia.org is inspired by Maven Central and Archiva.
Software is much smaller in size than data, and of course we cannot host all of
the data, so we keep only the metadata with links to decentralized downloadURLs.
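
The metadata-only idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Databus implementation: the names `FileRecord` and `MetadataIndex`, the Maven-like group/artifact/version coordinates, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileRecord:
    """Metadata for one published file; the bytes stay with the publisher."""
    group: str         # hypothetical Maven-like coordinates
    artifact: str
    version: str
    download_url: str  # link to the decentralized host (assumed field name)


class MetadataIndex:
    """In-memory stand-in for a central metadata catalogue (hypothetical)."""

    def __init__(self) -> None:
        self._records: List[FileRecord] = []

    def publish(self, record: FileRecord) -> None:
        # Only the metadata record is stored centrally, never the file itself.
        self._records.append(record)

    def resolve(self, group: str, artifact: str, version: str) -> List[str]:
        """Return the download URLs registered for the given coordinates."""
        return [
            r.download_url
            for r in self._records
            if (r.group, r.artifact, r.version) == (group, artifact, version)
        ]


index = MetadataIndex()
index.publish(
    FileRecord(
        group="example-group",
        artifact="example-artifact",
        version="2019.01.01",
        download_url="https://publisher.example.org/data/example.ttl.bz2",
    )
)
urls = index.resolve("example-group", "example-artifact", "2019.01.01")
```

A consumer would resolve coordinates against the index and then fetch the files directly from the publishers' servers, so the central service carries no file traffic.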

The DBpedia Databus is a platform that captures the effort invested by data
consumers who needed better data quality (fitness for use) in order to
use the data, and that gives these improvements back to the data source and
other consumers. The DBpedia Databus enables anybody to build automated
DBpedia-style extraction, mapping and testing for any data they need.
Databus incorporates features from DNS, Git, RSS, online forums and
Maven to harness the full work power of data consumers.

Professional consumers of data worldwide have already built stable
cleaning and refinement chains for all available datasets, but their
efforts are invisible and not reusable. Deep, cleaned data silos exist
beyond the reach of publishers and other consumers, trapped locally in
pipelines.
*Data is not oil that flows out of inflexible pipelines*. Databus breaks
existing pipelines into individual components that together form a
decentralized, but centrally coordinated data network in which data can
flow back to previous components, the original sources, or end up being
consumed by external components.

The Databus provides a platform for re-publishing these files with very
little effort (leaving file traffic as the only cost factor) while offering
the full benefits of built-in system features such as automated
publication, structured querying, automatic ingestion, as well as
pluggable automated analysis, data testing via continuous integration,
and automated application deployment *(software with data)*. The impact
is highly synergistic: just a few thousand professional consumers and
research projects can expose millions of cleaned datasets, which are on
par with what has long existed in deep silos and pipelines.

1 Billion interconnected, quality-controlled Knowledge Graphs by 2025