Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies [article]

Luke T Slater, Georgios V Gkoutos, Robert Hoehndorf
2020 biorxiv/medrxiv pre-print
Ontologies are widely used throughout the biomedical domain. These ontologies formally represent the classes and relations assumed to exist within a domain. As scientific domains are deeply interlinked, so too are their representations. While individual ontologies can be tested for consistency and coherency using automated reasoning methods, systematically combining ontologies from multiple domains may reveal previously hidden contradictions. Results: We developed a method that tests for hidden unsatisfiabilities in an ontology that arise when it is combined with other ontologies. For this purpose, we combine sets of ontologies and use automated reasoning to determine whether unsatisfiable classes are present. We test the mutual consistency of the OBO Foundry and the OBO ontologies and find that the combined OBO Foundry gives rise to at least 636 unsatisfiable classes, while the OBO ontologies give rise to more than 300,000 unsatisfiable classes. We design and implement a novel algorithm that can determine justifications for contradictions across extremely large and complicated ontologies, and use these justifications to semi-automatically repair ontologies by identifying the minimal set of axioms that, when removed, result in a consistent and coherent set of ontologies. We applied our algorithm to each combination of OBO ontologies that resulted in unsatisfiable classes. Conclusions: We identified a large set of hidden unsatisfiable classes across a broad range of biomedical ontologies, and we found that this large set is the result of a relatively small number of axiomatic disagreements. Our results show that hidden unsatisfiability is a serious problem in ontology interoperability; however, they also provide a way towards more consistent ontologies by addressing the issues we identified.
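The repair step described in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm, which targets extremely large ontologies; it only shows the underlying idea under simplifying assumptions. Here each justification is assumed to be already computed and is modelled as a set of axiom identifiers that jointly cause an unsatisfiable class. The function name `minimal_repair` and the axiom labels are hypothetical. A repair must remove at least one axiom from every justification, so the smallest repair is a minimum hitting set, found below by brute force.

```python
from itertools import combinations


def minimal_repair(justifications):
    """Return a smallest set of axioms that intersects every justification.

    Toy brute-force search (an assumption for illustration, not the
    paper's method): only practical for a handful of axioms.
    """
    justs = [frozenset(j) for j in justifications]
    # All axioms that appear in at least one justification.
    axioms = sorted(set().union(*justs)) if justs else []
    # Try candidate removal sets in order of increasing size.
    for size in range(len(axioms) + 1):
        for candidate in combinations(axioms, size):
            chosen = set(candidate)
            # A valid repair breaks every justification.
            if all(chosen & j for j in justs):
                return chosen
    return set()


# Hypothetical example: three unsatisfiable classes share one culprit axiom.
example = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax2", "ax4"}]
print(minimal_repair(example))
```

In this example a single shared axiom accounts for all three justifications, which mirrors the abstract's observation that many unsatisfiable classes can stem from a small number of axiomatic disagreements.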
doi:10.1101/2020.05.16.099309 fatcat:4rr5bjr4drecldakravt32gqse