Efficient Dynamic Pruning with Proximity Support

Nicola Tonellotto, Craig Macdonald, Iadh Ounis
2010 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
Modern retrieval approaches do not rely on single-term weighting models alone when ranking documents; proximity weighting models, which assign high scores to pairs of query terms that co-occur close to each other in a document, are also in common use. Adopting these proximity weighting models can add computational overhead when documents are scored, reducing the efficiency of the retrieval process. In this paper, we discuss the integration of proximity weighting models into efficient dynamic pruning strategies. In particular, we propose to modify document-at-a-time strategies to include proximity scoring without any modifications to pre-existing index structures. Our resulting two-stage dynamic pruning strategies consider only single query terms during first-stage pruning, but can terminate the proximity scoring of a document early if it can be shown that the document will never be retrieved. We empirically examine the efficiency benefits of our approach using a large Web test collection of 50 million documents and 10,000 queries from a real query log. Our results show that the proposed two-stage dynamic pruning strategies are considerably more efficient than the original strategies, particularly for queries of 3 or more terms.
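
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory postings layout, the function names (`two_stage_top_k`, `pair_score`), the term-frequency weighting, and the distance-based proximity score are all assumptions made for illustration, and the first stage here scores every matching document exhaustively rather than using a real dynamic pruning strategy. What it shows is only the second-stage early termination: proximity scoring of a document stops as soon as its score plus an upper bound on the remaining pair scores cannot exceed the current top-k threshold.

```python
import heapq
from itertools import combinations


def pair_score(pos_a, pos_b, max_pair_score):
    """Hypothetical proximity score: max_pair_score when the two terms are
    adjacent, decaying with the smallest distance between their positions."""
    distance = min(abs(a - b) for a in pos_a for b in pos_b)
    return max_pair_score / distance


def two_stage_top_k(docs, query, term_weight, k, max_pair_score=1.0, prune=True):
    """Document-at-a-time scoring over docs = {doc_id: {term: [positions]}}.

    Returns (ranking, pairs_scored), where ranking is a list of
    (score, doc_id) tuples and pairs_scored counts proximity computations.
    """
    terms = list(dict.fromkeys(query))  # distinct query terms, order kept
    heap = []                           # min-heap holding the current top k
    pairs_scored = 0
    for doc_id in sorted(docs):
        postings = docs[doc_id]
        present = [t for t in terms if t in postings]
        if not present:
            continue
        # Stage 1: single-term score only (toy weighting: weight * frequency).
        score = sum(term_weight[t] * len(postings[t]) for t in present)
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # Stage 2: proximity scoring, one term pair at a time.
        pairs = list(combinations(present, 2))
        remaining = len(pairs) * max_pair_score  # bound on unscored pairs
        pruned = False
        for a, b in pairs:
            if prune and score + remaining <= threshold:
                pruned = True  # the document can no longer enter the top k
                break
            score += pair_score(postings[a], postings[b], max_pair_score)
            pairs_scored += 1
            remaining -= max_pair_score
        if pruned:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > threshold:
            heapq.heapreplace(heap, (score, doc_id))
    ranking = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return ranking, pairs_scored
```

Because a document is skipped only when its best possible final score is at or below the threshold, the ranking in this sketch is the same with `prune=True` and `prune=False`; only the number of proximity computations changes.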
dblp:conf/sigir/TonellottoMO10 fatcat:a7ectv3ofnfuxdw7dd6vuisnty