Routing of structured queries in large-scale distributed systems
Proceedings of the 2008 ACM Workshop on Large-Scale Distributed Systems for Information Retrieval - LSDS-IR '08
To search XML document collections, structural information, given by a user in the form of a structured query or provided by the self-describing structure of XML documents, has been used in recent years to improve Information Retrieval (IR) quality in terms of recall and precision. However, all known approaches have been used only in classical client/server (C/S) architectures. None has been applied to improve retrieval in large-scale distributed systems such as Peer-to-Peer
... P) networks, where efficiency issues have to be dealt with carefully, e.g. to reduce communication overhead between distributed nodes. As P2P networks can be considered promising alternatives to C/S systems for storing large amounts of information, including XML documents, possibilities for improving retrieval in such networks should be investigated. In this paper, we concentrate on query routing in such a scenario and ask how structured queries can be routed in a highly distributed environment so as to increase both efficiency and effectiveness. We provide an infrastructure for investigating this question and propose techniques for performing routing based on a mixture of document-, element-, collection-, and peer-level evidence. We also report on preliminary evaluation results with the INEX collection.
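To make the idea of routing on mixed evidence concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each candidate peer already carries four normalized scores in [0, 1], one per evidence level, and that a routing decision is a weighted linear combination of them. The names `PeerEvidence`, `combined_score`, and `route`, the linear combination itself, and the weight values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerEvidence:
    """Hypothetical per-peer evidence for one structured query (scores in [0, 1])."""
    peer_id: str
    document: float    # evidence from documents held by the peer
    element: float     # evidence from XML elements matching the query structure
    collection: float  # evidence from the peer's collection as a whole
    peer: float        # evidence about the peer itself


# Illustrative weights only; the abstract does not specify how evidence is mixed.
WEIGHTS = {"document": 0.4, "element": 0.3, "collection": 0.2, "peer": 0.1}


def combined_score(ev, weights=WEIGHTS):
    """Weighted linear mixture of the four evidence levels (an assumption)."""
    return sum(w * getattr(ev, name) for name, w in weights.items())


def route(candidates, k):
    """Return the ids of the k highest-scoring peers to forward the query to."""
    ranked = sorted(candidates, key=combined_score, reverse=True)
    return [ev.peer_id for ev in ranked[:k]]


candidates = [
    PeerEvidence("A", document=1.0, element=0.0, collection=0.0, peer=0.0),
    PeerEvidence("B", document=0.0, element=1.0, collection=1.0, peer=1.0),
    PeerEvidence("C", document=0.5, element=0.5, collection=0.5, peer=0.5),
]
selected = route(candidates, k=2)
```

Forwarding the query only to the top `k` peers is one way to limit communication overhead: a smaller `k` sends fewer messages but risks missing relevant peers, which is the efficiency versus effectiveness trade-off the abstract raises.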