Explainable methods for knowledge graph refinement and exploration via symbolic reasoning [article]

Mohamed Hassan Mohamed Gad-Elrab, Universität des Saarlandes
Knowledge Graphs (KGs) have applications in many domains such as Finance, Manufacturing, and Healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. Therefore, it is crucial to enhance both the coverage and accuracy of KGs through KG completion and KG validation, together referred to as KG refinement. In this context, it is also vital to provide human-comprehensible explanations for the KG refinement output so that humans have trust in the refined KG quality. KG exploration, by search and browsing, is essential for users to understand the KG value and limitations towards down-stream applications. However, the large size of KGs makes KG exploration challenging. While the type taxonomy of KGs is a useful asset along these lines, it remains insufficient for deep exploration. This dissertation tackles the challenges of KG refinement and KG exploration by logical reasoning over the KG in combination with other techniques such as KG embedding models and text mining. We introduce methods for these goals which provide human-understandable output. Concretely, the dissertation consists of the following contributions:
• To tackle KG incompleteness, we present ExRuL, a method for revising Horn rules by adding exceptions (i.e., negated atoms) to their bodies. Learned rules can be used to predict new facts to fill gaps in the KG. Experiments on real-world KGs show that exception-aware rules vastly reduce the error rate in fact prediction. Besides, rules provide user-comprehensible explanations for these predictions.
• We also present RuLES, a rule learning method that utilizes probabilistic representations of missing facts. The method iteratively extends the rules induced from a KG by incorporating feedback from a precomputed KG embedding combined with text corpora. The method harnesses newly devised measures for rule quality. RuLES improves the quality of the learned rules and their predictions.
• To support KG validation, we propose ExFaKT, a framework for constructing human-comprehensible explanations for candidate facts. The method uses rules to rewrite a candidate fact into a set of related facts that are easier to spot and confirm (or refute). The output of ExFaKT is a set of semantic traces for the candidate facts from both text and the KG.
Experiments show that rule-based rewriting significantly improves the recall of the discovered traces while preserving a high precision. Furthermore, the explanations support both manual and automatic KG validation.
• To facilitate KG exploration, we introduce ExCut, a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations. Cluster explanation consists of a concise combination of entity relations that distinguish this cluster. ExCut jointly enhances the quality of entity clusters and their explanations by iteratively interleaving the learning of embeddings and rules. Experiments show that ExCut produces high-quality clusters, and the explanations computed for them help humans understand the commonalities among entities within these clusters.

Acknowledgements

During this thesis, I have learned several essential research skills that, I believe, will shape my research career. Therefore, I would like to express my sincere gratitude to Prof. Gerhard Weikum for giving me the opportunity to work under his supervision in such a pioneering group, for facilitating the research and for his valuable advice throughout the thesis. I also would like to show my sincere gratitude and appreciation for my advisor Mohamed Amir for his continuous guidance and support on the professional and personal levels. I really appreciate his patience teaching me loads of essential research, communication and planning skills. I am extremely thankful for his generosity sharing his valuable expertise and time. Working with him was one of the richest experiences in my life. I wish to have the opportunity to work with him again in the future.

I would like to thank Akram El-korashy, Mayank Goyal and Uzair Mahmoud for their useful feedback regarding the thesis writing. Also, our experiments would not be finalized without the help of the proactive volunteers who agreed to participate in our manual assessment. I would also like to thank the reviewers of this thesis for their precious time and effort.

By the end of this master's program, I am grateful to the International Max Planck Research School for Computer Science (IMPRS-CS) family for their support throughout the master program. I believe I was fortunate enough to be part of this big family.

On the personal level, words will never be enough to express how I am thankful and indebted to my family (my parents, sister and brother) for their sincere support, encouragement and prayers throughout my life and my long education journey. I appreciate their patience towards my continuous absence. I would like to thank all my friends in Saarbrücken. I believe I am blessed being surrounded by all those intelligent, caring and enthusiastic personalities. Finally, I would like to extend my sense of gratitude to everyone who expressed his support and/or made Dua for me.

Beijing, China
Mohamed Gad-Elrab
July, 2015
doi:10.22028/d291-34423 fatcat:hvpoxkfc5zgmbce32pfbcvmejy