Completion-based generalization inferences for the Description Logic ELOR with subjective probabilities

Andreas Ecke, Rafael Peñaloza, Anni-Yasmin Turhan
2014 International Journal of Approximate Reasoning  
Description Logics (DLs) are a well-established family of knowledge representation formalisms. One of its members, the DL ELOR, has been successfully used for representing knowledge from the bio-medical sciences, and is the basis for the OWL 2 EL profile of the standard ontology language for the Semantic Web. Reasoning in this DL can be performed in polynomial time through a completion-based algorithm. In this paper we study the logic Prob-ELOR, which extends ELOR with subjective probabilities,
and present a completion-based algorithm for polynomial time reasoning in a restricted version, Prob-ELOR^{01}_c, of Prob-ELOR. We extend this algorithm to computation algorithms for approximations of (i) the most specific concept, which generalizes a given individual into a concept description, and (ii) the least common subsumer, which generalizes several concept descriptions into one. Thus, we also obtain methods for these inferences for the OWL 2 EL profile. These two generalization inferences are fundamental for building ontologies automatically from examples. The feasibility of our approach is demonstrated empirically by our prototype system Gel.

Obese ⊑ P_{0.9} ∃hasCondition.HighPressure.

While most DLs studied in [1] are intractable or even undecidable for unrestricted probabilistic roles, a fragment Prob-EL^{01}_c extending EL was identified to still admit polynomial time reasoning. In this fragment, probabilistic concepts can be constructed using only the probabilities >0 and =1. A completion algorithm for classifying TBoxes in the language Prob-EL^{01}_c was described in [1]. However, the algorithm described by the authors is not complete; the corrected version is given in this paper, since it is needed in our algorithms for computing generalizations.

Beyond the standard reasoning services, there also exist a number of non-standard inferences like the generalization of different entities from DL knowledge bases. The least common subsumer (lcs) inference introduced in [14] generalizes a set of concept descriptions into a single new concept description that subsumes all the input concepts and that is least w.r.t. subsumption. Intuitively, the lcs captures all commonalities of the input concept descriptions.
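
The intuition that the lcs keeps exactly the commonalities can be illustrated with a minimal Python sketch. It covers only plain EL concept descriptions w.r.t. the empty TBox, using the well-known product construction on concept descriptions; it is not the completion-based procedure of the paper and it ignores nominals, role inclusions, and probabilistic constructors. The class `Concept`, the function `lcs`, and the example concepts are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Concept:
    """EL concept description: a conjunction of concept names and
    existential restrictions (hypothetical helper type)."""
    names: frozenset = frozenset()
    exists: frozenset = frozenset()  # set of (role, Concept) pairs


TOP = Concept()  # the empty conjunction stands for the top concept


def lcs(c, d):
    """lcs of two EL concept descriptions, assuming an EMPTY TBox."""
    # Keep only the concept names shared by both inputs.
    common_names = c.names & d.names
    # Pair up existential restrictions over the same role and recurse.
    successors = frozenset(
        (r, lcs(c1, d1))
        for (r, c1), (s, d1) in product(c.exists, d.exists)
        if r == s
    )
    return Concept(common_names, successors)


# Illustrative inputs (assumed, probability constructors dropped).
mary = Concept(
    frozenset({"Person", "Obese"}),
    frozenset({("hasCondition", Concept(frozenset({"HighPressure"})))}),
)
patient = Concept(
    frozenset({"Person"}),
    frozenset({("hasCondition", Concept(frozenset({"RadiusFracture"})))}),
)
common = lcs(mary, patient)  # Person with some hasCondition successor
```

The result is not minimized, so for larger inputs it may contain redundant existential restrictions. With a non-empty TBox this syntactic construction is no longer sufficient, since consequences of the axioms would be missed.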
A second inference, the most specific concept (msc) [15], generalizes an individual into the most precise concept description that describes this individual. Given the previous axioms that describe obese persons and mothers, assume that we have the additional knowledge that Mary is obese: Obese(mary). Then the msc of mary is the concept Obese ⊓ P_{0.9} ∃hasCondition.HighPressure ⊓ Mother ⊓ Female ⊓ Person ⊓ Woman ⊓ ∃has-child.Male, which is incidentally equivalent to simply Obese ⊓ Mother ⊓ ∃has-child.Male. The lcs of this concept and Person ⊓ P_{0.6} ∃has-condition.RadiusFracture (which might occur if an x-ray only shows a vague line) is Person ⊓ P_{0.6} ∃has-condition.⊤.

These generalization inferences have a variety of applications. In the bottom-up construction of knowledge bases, new concept descriptions can be generated in an example-driven way from a set of individuals that a user selects [15,16]. Each of the selected individuals is first generalized into a concept description by the msc, and then all of these concept descriptions are generalized into a single one by the lcs. This approach enables users of DL knowledge bases with little KR expertise to augment their ontologies with new concepts. Another application of generalization inferences is concept similarity measures [17,18]. These measures assess the similarity of two concepts and are the core of many ontology matching algorithms. Furthermore, in ontology-based information retrieval the msc and lcs are used to relax search concepts, which encode the information to be searched [19-21]. For more applications of these generalization inferences see [2,16].

Neither the lcs nor the msc needs to exist in EL if computed w.r.t. general or cyclic TBoxes [22] or cyclic ABoxes [23]. The reason is that the cyclic structure cannot be captured by a finite EL-concept description. In [24] an extension of EL with greatest fixpoints was introduced, where the generalization concepts always exist.
Earlier, in [25], it was shown that under greatest fixpoint semantics the lcs does exist. However, for both approaches the resulting DL may not be as easy to comprehend for a DL system user. Thus, we pursue a different approach here.

Computation algorithms for approximative solutions for the lcs were devised in [2] and for the msc in [26]. These methods simply compute a generalization concept up to a certain size k, which is interpreted as a bound on the role-depth, i.e., the maximal nesting of quantifiers. One way to compute the approximative generalizations k-lcs and k-msc is to use the canonical model constructed by the completion algorithm for EL. This approach has been studied intensively and extended to ELR and EL with inverse roles [2,27,28]. Furthermore, completion-based classification algorithms are becoming more widely used, both from a practical point of view in terms of reasoner implementations [29-31] and on the theoretical side with the recent extensions of EL with nominals [32], subjective probabilities [1], or even Horn variants of expressive DLs [33].

In cases where the lcs or msc exists and a large enough bound k is given, the methods for computing the role-depth bounded lcs and the role-depth bounded msc yield the exact solutions. However, to obtain the least common subsumer and the most specific concept by these methods in practice, a decision procedure for the existence of the lcs or msc, respectively, and a method for computing a sufficient k are needed. These have recently been supplied for EL in [34] and for EL extended by complex role inclusions in [35]. Although this is a rather pragmatic approach, the role-depth bounded lcs and the role-depth bounded msc may yield approximations that are sufficient for most of the practical applications named above. Other applications require the notion of role-depth bounded generalizations.
For example, [21] solves the problem of instance queries for concepts relaxed by similarity measures by computing a so-called mimic of the query concept w.r.t. a candidate individual a, which can be found by considering subconcepts of the role-depth bounded msc of a. Curé et al. [36] describe an application that evaluates user traces by making use of the probabilistic DLs as defined by Lutz and Schröder [1]. Interestingly, the authors need to compute the msc (and afterwards the lcs) for k = 1 in their application. They give an ad-hoc procedure to compute these inferences. However, since their method for the 1-msc does not take the TBox information into account, their algorithm is not correct.

In this paper we devise algorithms for computing the role-depth bounded generalizations for Prob-EL^{01}_c and some of its extensions, and we prove their correctness. In detail, the contributions of this paper are the following:

Classification algorithms. We give a uniform description of the completion-based classification procedures for the DLs ELOR and Prob-ELO^{01}_c, i.e., Prob-EL^{01}_c extended by nominals. We also amend an error in the completion algorithm for Prob-EL^{01}_c presented in [1]. We show correctness of the extension of the amended algorithm to handle nominals.

Computation algorithms for the role-depth bounded lcs. The completion algorithms for classification are the basis on which we develop algorithms to compute the role-depth bounded lcs in ELOR and Prob-ELO^{01}_c. We also show correctness of our methods.

Computation algorithms for the role-depth bounded msc. Since the msc in the presence of nominals is trivial (msc(a) = {a}), another target DL should be considered in order to yield an informative version of the msc. Thus we consider EL and later Prob-EL^{01}_c as the target DL for the msc. Based on the completion algorithms for classification in ELOR and Prob-ELO^{01}_c, we develop algorithms to compute the role-depth bounded msc w.r.t. KBs written in ELOR and Prob-ELO^{01}_c and show correctness of these methods.
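
To see why a role-depth bound k makes the msc computable even for cyclic ABoxes, consider the following minimal Python sketch. It is an illustration under strong assumptions: the TBox is empty, the ABox contains only concept-name and role assertions, and the result is returned as a string. The algorithms of the paper instead traverse the completion sets produced by classification, so that TBox consequences are included; a procedure like this one, which ignores the TBox, is incomplete as soon as axioms are present. The function `k_msc` and the example ABox are hypothetical.

```python
def k_msc(individual, k, concept_assertions, role_assertions):
    """Role-depth bounded msc of an individual, assuming an EMPTY TBox.

    concept_assertions: dict mapping an individual to a set of concept names
    role_assertions:    dict mapping an individual to a set of
                        (role, successor individual) pairs
    """
    # Concept names asserted directly for the individual.
    parts = sorted(concept_assertions.get(individual, set()))
    # Follow role assertions only while the depth budget k is not used up.
    if k > 0:
        for role, successor in sorted(role_assertions.get(individual, set())):
            filler = k_msc(successor, k - 1, concept_assertions, role_assertions)
            parts.append(f"∃{role}.({filler})")
    return " ⊓ ".join(parts) if parts else "⊤"


# A cyclic ABox (assumed example): A(a), B(b), r(a, b), r(b, a).
concept_assertions = {"a": {"A"}, "b": {"B"}}
role_assertions = {"a": {("r", "b")}, "b": {("r", "a")}}

for depth in range(3):
    print(depth, k_msc("a", depth, concept_assertions, role_assertions))
```

Because of the cycle between a and b, every larger k yields a strictly more specific concept, so no finite EL concept description is the exact msc of a here; the bound k is what cuts off the unraveling.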
doi:10.1016/j.ijar.2014.03.001 fatcat:tlhvh3mc4bhx3o3b2k22cuu2li