FleXPath

Sihem Amer-Yahia, Laks V. S. Lakshmanan, Shashank Pandit
2004 Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD '04  
Querying XML data is a well-explored topic with powerful database-style query languages such as XPath and XQuery set to become W3C standards. An equally compelling paradigm for querying XML documents is full-text search on textual content. In this paper, we study fundamental challenges that arise when we try to integrate these two querying paradigms. While keyword search is based on approximate matching, XPath has exact match semantics. We address this mismatch by considering queries on
... as a "template", and looking for answers that best match this template and the full-text search. To achieve this, we provide an elegant definition of relaxation on structure and define primitive operators to span the space of relaxations. Query answering is now based on ranking potential answers on structural and full-text search conditions. We set out certain desirable principles for ranking schemes and propose natural ranking schemes that adhere to these principles. We develop efficient algorithms for answering top-K queries and discuss results from a comprehensive set of experiments that demonstrate the utility and scalability of the proposed framework and algorithms.
doi:10.1145/1007568.1007581 dblp:conf/sigmod/Amer-YahiaLP04 fatcat:kivqnzy7ibbn5enokrgilyp3ua