On Divergence Measures and Static Index Pruning

Ruey-Cheng Chen, Chia-Jung Lee, W. Bruce Croft
2015 Proceedings of the 2015 International Conference on Theory of Information Retrieval - ICTIR '15  
We study the problem of static index pruning in a renowned divergence minimization framework, using a range of divergence measures such as f-divergence and Rényi divergence as the objective.  ...  Our approach allows postings be prioritized according to the amount of information they contribute to the index, and through specifying a different divergence measure the contribution is modeled on a different  ...  CONCLUSIONS In this paper, we provide a thorough study on a wide range of divergence measures and their use on static index pruning.  ... 
doi:10.1145/2808194.2809472 dblp:conf/ictir/ChenLC15 fatcat:ivyasgpjqfaeljnjlnygliblqq

An Empirical Analysis of Pruning Techniques

Ruey-Cheng Chen, Leif Azzopardi, Falk Scholer
2017 Proceedings of the 2017 ACM on Conference on Information and Knowledge Management - CIKM '17  
In this paper, we investigate how the retrieval bias of a system changes as the inverted index is optimized for efficiency through static index pruning.  ...  In our analysis, we consider four pruning methods and examine how they affect performance and bias on the TREC GOV2 Collection.  ...  BACKGROUND Research on index pruning can be divided into two areas, called static and dynamic index pruning, based on when and how the pruning is performed.  ... 
doi:10.1145/3132847.3133151 dblp:conf/cikm/ChenAS17 fatcat:6za4i2g635cdzgweizw53yzuwu

An information-theoretic account of static index pruning

Ruey-Cheng Chen, Chia-Jung Lee
2013 Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13  
We show that static index pruning has an approximate analytical solution in the form of convex integer program.  ...  In this paper, we recast static index pruning as a model induction problem under the framework of Kullback's principle of minimum cross-entropy.  ...  Kullback's solution was simple and elegant: One shall choose a measure that most closely resembles the previous measurement in terms of Kullback-Leibler divergence.  ... 
doi:10.1145/2484028.2484061 dblp:conf/sigir/ChenL13 fatcat:nlnsfib4rfhv5nhe447mrcalya

Index Pruning and Result Reranking: Effects on Ad-Hoc Retrieval and Named Page Finding

Stefan Büttcher, Charles L. A. Clarke, Peter C. K. Yeung
2006 Text Retrieval Conference  
Our experiments are centered around two concepts: Static index pruning (for increased retrieval efficiency) and result reranking (for improved precision).  ...  We show that index pruning and reranking based on relevance models can be beneficial in an ad-hoc retrieval setting, but have a disastrous repercussion on the effectiveness of named page finding.  ...  STATIC INDEX PRUNING The notion of static index pruning was officially introduced by Carmel et al. [6] .  ... 
dblp:conf/trec/ButtcherCY06 fatcat:fyjtwwj33zdy3nkcmvzg5gubn4

A Practitioner's Guide for Static Index Pruning [chapter]

Ismail Sengor Altingovde, Rifat Ozcan, Özgür Ulusoy
2009 Lecture Notes in Computer Science  
We compare the term- and document-centric static index pruning approaches as described in the literature and investigate their sensitivity to the scoring functions employed during the pruning and actual  ...  Static Inverted Index Pruning Static index pruning permanently removes some information from the index, for the purposes of utilizing the disk space and improving query processing efficiency.  ...  Conclusion In this study, we compare the performance of TCP and DCP algorithms on the static pruning of the entire index files, and show that the former performs better for our dataset and experimental  ... 
doi:10.1007/978-3-642-00958-7_65 fatcat:7vlkvyyzxnfqfowtfesuuaacty

A document-centric approach to static index pruning in text retrieval systems

Stefan Büttcher, Charles L. A. Clarke
2006 Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06  
This results in great efficiency gains, superior to those of earlier pruning methods, and an average response time around 20 ms on the GOV2 document collection.  ...  We present a static index pruning method, to be used in ad-hoc document retrieval tasks, that follows a document-centric approach to decide whether a posting for a given term should remain in the index  ...  [6] broke away from the dynamic pruning paradigm and introduced the concept of static index pruning to information retrieval.  ... 
doi:10.1145/1183614.1183644 dblp:conf/cikm/ButtcherC06 fatcat:3hq2diemnbennie4kj2t4vb6mq

Information preservation in static index pruning

Ruey-Cheng Chen, Chia-Jung Lee, Chiung-Min Tsai, Jieh Hsiang
2012 Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12  
We develop a new static index pruning criterion based on the notion of information preservation.  ...  We model this loss in predictive power using conditional entropy and show that the decision in static index pruning can therefore be optimized to preserve information as much as possible.  ...  In this paper, we discuss the idea of information preservation and use that to motivate a new decision measure for static index pruning.  ... 
doi:10.1145/2396761.2398673 dblp:conf/cikm/ChenLTH12 fatcat:eappm2ofkvb3rirhcdmseqyvwe

Within-Document Term-Based Index Pruning with Statistical Hypothesis Testing [chapter]

Sree Lekha Thota, Ben Carterette
2011 Lecture Notes in Computer Science  
Document-centric static index pruning methods provide smaller indexes and faster query times by dropping some within-document term information from inverted lists.  ...  Experimental results show that this technique can be used to significantly decrease the size of the index and querying speed with less compromise to retrieval effectiveness than similar heuristic methods  ...  Static Index Pruning Using the 2N2P Test We will use the above described statistical method to make pruning decisions.  ... 
doi:10.1007/978-3-642-20161-5_54 fatcat:s27nko6ravhdlm2r23bbharfsy

XML Retrieval Using Pruned Element-Index Files [chapter]

Ismail Sengor Altingovde, Duygu Atilgan, Özgür Ulusoy
2010 Lecture Notes in Computer Science  
A direct index, on the other hand, only indexes the content that is directly under each element and disregards the descendants.  ...  In this paper, we propose using static index pruning techniques for obtaining more compact index files that can still result in comparable retrieval performance to that of a full index.  ...  This work is supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) by the grant number 108E008.  ... 
doi:10.1007/978-3-642-12275-0_28 fatcat:5uteu4za5zhxlog7zp5gibvq34

Exploiting query views for static index pruning in web search engines

Ismail Sengor Altingovde, Rifat Ozcan, Özgür Ulusoy
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
We propose incorporating query views in a number of static pruning strategies, namely term-centric, document-centric and access-based approaches.  ...  These query-view based strategies considerably outperform their counterparts for both disjunctive and conjunctive query processing in Web search engines.  ...  INTRODUCTION Static index pruning techniques permanently remove a presumably redundant part of an inverted file, to reduce the file size and query processing time.  ... 
doi:10.1145/1645953.1646273 dblp:conf/cikm/AltingovdeOU09 fatcat:wbgpejlaqrhhzefqaqg5zpcoi4

Diversification Based Static Index Pruning - Application to Temporal Collections [article]

Zeynep Pehlivan, Benjamin Piwowarski, Stéphane Gançarski
2013 arXiv pre-print
Decreasing the index size is a direct way to decrease this query response time. Static index pruning methods reduce the size of indexes by removing a part of the postings.  ...  None of the existing pruning approaches take (temporal) diversification into account. In this paper, we propose a diversification-based static index pruning method.  ...  On the other hand, static pruning reduces the index size by removing postings from the index offline, independently from any query.  ... 
arXiv:1308.4839v1 fatcat:6zs55gvrqrfsljqnzrcttaopxe

Exploiting Index Pruning Methods for Clustering XML Collections [chapter]

Ismail Sengor Altingovde, Duygu Atilgan, Özgür Ulusoy
2010 Lecture Notes in Computer Science  
Next, we apply index pruning techniques from the literature to reduce the size of the document vectors.  ...  Our experiments show that for certain cases, it is possible to prune up to 70% of the collection (or, more specifically, underlying document vectors) and still generate a clustering structure that yields  ...  Employing Pruning Strategies for Clustering From the previous works, it is known that static index pruning techniques can reduce the size of an index (and the underlying collection) while providing comparative  ... 
doi:10.1007/978-3-642-14556-8_37 fatcat:ytr4axrg5nfexerpgwdar2h3wm

Static Index Pruning for Information Retrieval Systems: A Posting-Based Approach

Linh Thai Nguyen
2009 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
Static index pruning methods have been proposed to reduce size of the inverted index of information retrieval systems.  ...  This approach ranks all postings and keeps only a subset of top ranked ones.  ...  CONCLUSIONS AND FUTURE WORK We evaluate document-centric and term-centric static index pruning based on the WT10G corpus and TREC query sets.  ... 
dblp:conf/sigir/Nguyen09 fatcat:iddt2wjftzfqjopgyfroqyxaxe

Dynamic Materialization of Query Views for Data Warehouse Workloads

Thomas Phan, Wen-Syan Li
2008 2008 IEEE 24th International Conference on Data Engineering  
solution space, and to avoid materializing seldom-used MQTs, we prune the set of MQT candidates.  ...  In this paper we present an automated, dynamic MQT management scheme that materializes views and creates indexes in an on-demand fashion as a workload executes and manages them with an LRU cache.  ...  Note that we use the term MQT to refer collectively to materialized views, materialized views and their indexes, and indexes on base tables.  ... 
doi:10.1109/icde.2008.4497452 dblp:conf/icde/PhanL08 fatcat:pwc2lafz25esxlb32qf5hszllm

Index ordering by query-independent measures

Paul Ferguson, Alan F. Smeaton
2012 Information Processing & Management  
Our experiments, carried out on the TREC Terabyte collection, report significant savings, in terms of number of postings examined, without significant loss of effectiveness when based on several measures  ...  of importance used in isolation, and in combination.  ...  Acknowledgments This work was funded by Science Foundation Ireland as part of the CLARITY CSET, under grant numbers 03/IN.3/I361 and 07/CE/I1147.  ... 
doi:10.1016/j.ipm.2011.10.003 fatcat:t35xsxhvejecpfvelqxpmm37fa