Efficient identification of implicit facts in incomplete OWL2-EL knowledge bases

John Liagouris, Manolis Terrovitis
2014 Proceedings of the VLDB Endowment  
Integrating incomplete and possibly inconsistent data from various sources is a challenge that arises in several application areas, especially in the management of scientific data. A rising trend in data integration is to model the data as axioms in the Web Ontology Language (OWL) and use inference rules to identify new facts. Although several approaches employ OWL for data integration, there is little work on scalable algorithms able to handle large datasets that do not fit in main memory. The main contribution of this paper is an algorithm that allows the effective use of OWL for integrating data in an environment with limited memory. The core idea is to exhaustively apply a set of complex inference rules on large disk-resident datasets. To the best of our knowledge, this is the first work that proposes an I/O-aware algorithm for a subset of OWL as expressive as the one we address here. Previous approaches considered either simpler models (e.g., RDFS) or main-memory algorithms. In the paper we detail the proposed algorithm, prove its correctness, and experimentally evaluate it on real and synthetic data.
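As a rough illustration of what "exhaustively applying inference rules" means, the sketch below saturates a tiny fact set in memory until no new fact can be derived. It is not the I/O-aware algorithm of the paper: the two rules (subclass transitivity and type propagation), the function name `saturate`, and the example facts are all assumptions chosen for brevity, and a disk-resident dataset would need a very different evaluation strategy.

```python
def saturate(subclass_of, type_of):
    """Naive fixpoint: apply two illustrative rules until nothing new appears.

    Hypothetical rules (not taken from the paper):
      R1: A subClassOf B, B subClassOf C  =>  A subClassOf C
      R2: x type A,       A subClassOf B  =>  x type B
    """
    sub = set(subclass_of)
    typ = set(type_of)
    changed = True
    while changed:
        changed = False
        # R1: join subclass pairs on the shared middle class.
        new_sub = {(a, c) for (a, b) in sub for (b2, c) in sub if b == b2}
        # R2: push each instance up to every known superclass.
        new_typ = {(x, b) for (x, a) in typ for (a2, b) in sub if a == a2}
        if not new_sub <= sub or not new_typ <= typ:
            sub |= new_sub
            typ |= new_typ
            changed = True
    return sub, typ


# Made-up example facts.
subclass_of = {("Enzyme", "Protein"), ("Protein", "Molecule")}
type_of = {("trypsin", "Enzyme")}
sub, typ = saturate(subclass_of, type_of)
```

Here the implicit facts `("Enzyme", "Molecule")` and `("trypsin", "Molecule")` end up in the saturated sets. Every round of this naive loop re-joins the full fact sets, which is the kind of cost that becomes prohibitive once the data no longer fits in main memory.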
doi:10.14778/2733085.2733104 fatcat:vht4ep5uavgvlbd3qstglae5p4