Completing the is-a structure in light-weight ontologies

Patrick Lambrix, Fang Wei-Kleiner, Zlatan Dragisic
2015 Journal of Biomedical Semantics  
With the increasing presence of biomedical data sources on the Internet, more and more research effort is put into finding ways to integrate and search these often heterogeneous sources. Ontologies are a key technology in this effort. However, developing ontologies is not an easy task, and the resulting ontologies are often incomplete. In addition to being problematic for the correct modelling of a domain, such incomplete ontologies, when used in semantically-enabled applications, can lead to valid conclusions being missed. Results: We consider the problem of repairing missing is-a relations in ontologies. We formalize the problem as a generalized TBox abduction problem. Based on this abduction framework, we present complexity results for the existence, relevance and necessity decision problems for the generalized TBox abduction problem, with and without some specific preference relations, for ontologies that can be represented using a member of the EL family of description logics. Further, we present algorithms for finding solutions, a system, and experiments. Conclusions: Semantically-enabled applications need high-quality ontologies, and one key aspect is their completeness. We have introduced a framework and system that provide an environment for supporting domain experts in completing the is-a structure of ontologies. We have shown the usefulness of the approach in different experiments. For the two Anatomy ontologies from the Ontology Alignment Evaluation Initiative, we repaired 94 and 58 initially given missing is-a relations, respectively, and detected and repaired an additional 47 and 10 missing is-a relations. In an experiment with BioTop without given missing is-a relations, we detected and repaired 40 new missing is-a relations.

these ontologies thus providing means for annotating and sharing biomedical data sources. Many of the ontologies in the biomedical domain, e.g., SNOMED [4] and Gene Ontology [5], are, regarding knowledge representation, light-weight ontologies. They are taxonomies or can be represented using the EL description logic or small extensions thereof (e.g., [6] and the TONES Ontology Repository [7]). Therefore, in this paper, we consider ontologies that are represented by TBoxes in the EL family, which consist of axioms such as Carditis ⊑ Fracture, with the intended meaning that Carditis is a Fracture, where Carditis and Fracture are concepts and the relationship is an is-a relation.
(For the detailed syntax, see Section Preliminaries.) A set of such terminological axioms is a TBox. Developing ontologies is not an easy task, and often the resulting ontologies (including their is-a structures) are
doi:10.1186/s13326-015-0002-8 pmid:25883780 pmcid:PMC4399482 fatcat:yulvav4n6jhuvcryadiw3bhune