Question answering on interlinked data

Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Sören Auer
2013 Proceedings of the 22nd international conference on World Wide Web - WWW '13  
The Data Web contains a wealth of knowledge on a large number of domains. Question answering over interlinked data sources is challenging due to two inherent characteristics. First, different datasets employ heterogeneous schemas and each one may only contain a part of the answer for a certain question. Second, constructing a federated formal query across different datasets requires exploiting links between the different datasets on both the schema and instance levels. We present a question
more » ... ering system, which transforms user supplied queries (i.e. natural language sentences or keywords) into conjunctive SPARQL queries over a set of interlinked data sources. The contribution of this paper is two-fold: Firstly, we introduce a novel approach for determining the most suitable resources for a user-supplied query from different datasets (disambiguation). We employ a hidden Markov model, whose parameters were bootstrapped with different distribution functions. Secondly, we present a novel method for constructing a federated formal queries using the disambiguated resources and leveraging the linking structure of the underlying datasets. This approach essentially relies on a combination of domain and range inference as well as a link traversal method for constructing a connected graph which ultimately renders a corresponding SPARQL query. The results of our evaluation with three life-science datasets and 25 benchmark queries demonstrate the effectiveness of our approach.
doi:10.1145/2488388.2488488 dblp:conf/www/ShekarpourNA13 fatcat:ydb2ukwqm5exfgtopqltxk5jei