Ontology-Based Integration of Cross-Linked Datasets [chapter]

Diego Calvanese, Martin Giese, Dag Hovland, Martin Rezk
2015 Lecture Notes in Computer Science  
In this paper we tackle the problem of answering SPARQL queries over virtually integrated databases. We assume that the entity resolution problem has already been solved and explicit information is available about which records in the different databases refer to the same real world entity. Surprisingly, to the best of our knowledge, there has been no attempt to extend the standard Ontology-Based Data Access (OBDA) setting to take into account these DB links for SPARQL query-answering and
more » ... tency checking. This is partly because the OWL built-in owl:sameAs property, the most natural representation of links between data sets, is not included in OWL 2 QL, the de facto ontology language for OBDA. We formally treat several fundamental questions in this context: how links over database identifiers can be represented in terms of owl:sameAs statements, how to recover rewritability of SPARQL into SQL (lost because of owl:sameAs statements), and how to check consistency. Moreover, we investigate how our solution can be made to scale up to large enterprise datasets. We have implemented the approach, and carried out an extensive set of experiments showing its scalability.
doi:10.1007/978-3-319-25007-6_12 fatcat:rkm7obtudvgrtfrq5r6pynfuuq