A document-centric approach to static index pruning in text retrieval systems
2006
Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06
We present a static index pruning method, to be used in ad-hoc document retrieval tasks, that follows a document-centric approach to decide whether a posting for a given term should remain in the index or not. The decision is made based on the term's contribution to the document's Kullback-Leibler divergence from the text collection's global language model. Our technique can be used to decrease the size of the index by over 90%, at only a minor decrease in retrieval effectiveness. It thus allows
doi:10.1145/1183614.1183644
dblp:conf/cikm/ButtcherC06
fatcat:3hq2diemnbennie4kj2t4vb6mq
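
The pruning criterion named in the abstract, a term's contribution to the document's Kullback-Leibler divergence from the collection language model, can be illustrated with a minimal sketch. Everything beyond that criterion is an assumption made for illustration: the function names (`kl_contributions`, `prune_document`), the unsmoothed maximum-likelihood estimates, the toy documents, and the rule of keeping a fixed number of top-scoring terms per document are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def kl_contributions(doc_tokens, collection_counts, collection_size):
    """Score each term t in a document by P_d(t) * log(P_d(t) / P_C(t)).

    P_d is the document's maximum-likelihood language model and P_C is the
    collection's. The sum of these scores over all terms in the document is
    the KL divergence of the document model from the collection model.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    scores = {}
    for term, tf in doc_counts.items():
        p_doc = tf / doc_len
        p_coll = collection_counts[term] / collection_size
        scores[term] = p_doc * math.log(p_doc / p_coll)
    return scores


def prune_document(doc_tokens, collection_counts, collection_size, keep=2):
    """Return the terms whose postings would stay in the index.

    Assumption: keep a fixed number of top-scoring terms per document.
    Ties are broken alphabetically so the result is deterministic.
    """
    scores = kl_contributions(doc_tokens, collection_counts, collection_size)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:keep])


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "the index pruning method removes postings from the index".split(),
    "d2": "the language model of the collection".split(),
    "d3": "the retrieval system ranks the documents".split(),
}

collection_counts = Counter(t for tokens in docs.values() for t in tokens)
collection_size = sum(collection_counts.values())

pruned_index = {
    doc_id: prune_document(tokens, collection_counts, collection_size)
    for doc_id, tokens in docs.items()
}

for doc_id, terms in sorted(pruned_index.items()):
    print(doc_id, sorted(terms))
```

In this toy collection, a term such as "the" that is frequent everywhere scores low, because its in-document probability is close to its collection-wide probability, so its postings are the first to be dropped. A term concentrated in one document, such as "index" in `d1`, scores high and is kept.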