SPARQL query rewriting for implementing data integration over linked data
Proceedings of the 1st International Workshop on Data Semantics - DataSem '10
The publication of structured data in RDF has increased recently, driven by the Linked Data community¹. The presence on the Web of such a huge information cloud, ranging from academic to geographic to gene-related information, poses a great challenge when it comes to reconciling the heterogeneous schemas adopted by data publishers. For several years, the Semantic Web community has been developing algorithms for aligning data models (ontologies). Nevertheless, exploiting
such ontology alignments to achieve data integration is still an under-supported research topic. The semantics of ontology alignments, often defined over a logical framework, implies a reasoning step over huge amounts of data, which is often hard to implement and rarely scales to Web dimensions. This paper presents an algorithm for achieving RDF data mediation based on SPARQL query rewriting. The approach is based on the encoding of rewriting rules for RDF patterns that constitute part of the structure of a SPARQL query.

... of data sets, published in RDF format, is emerging, fuelled primarily by the efforts of the Linked Data community, which advocates the adoption of simple design principles² in order to create a "Web of Data". Key factors in the success of such a vision of a network of machine-readable information are the establishment of a set of widely used standards and procedures, which can be summarised in four main points:

• Identify resources with URIs.
• Use HTTP URIs instead of proprietary schemes.
• Use resolvable URIs so that users can retrieve information about resources using HTTP lookup.