240 Hits in 2.1 sec

Evaluating the Potential of Explicit Phrases for Retrieval Quality [chapter]

Andreas Broschart, Klaus Berberich, Ralf Schenkel
2010 Lecture Notes in Computer Science  
This paper evaluates the potential impact of explicit phrases on retrieval quality through a case study with the TREC Terabyte benchmark. ... It compares the performance of user- and system-identified phrases with a standard score and a proximity-aware score, and shows that an optimal choice of phrases, including term permutations, can significantly ... Web, and the 150 topics from the TREC Terabyte AdHoc tracks 2004-2006 (topics 701-850). ...
doi:10.1007/978-3-642-12275-0_62 fatcat:ekd7wnkm7jbppiqy4lbubribri

A comparison of pooled and sampled relevance judgments

Ian Soboroff
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
This paper describes the approach taken in the TREC 2006 Terabyte Track: an initial shallow pool was judged to gather relevance information, which was then used to draw a random sample of further documents  ...  The sample judgments rank systems somewhat differently than the pool. Some analysis and plans for further research are discussed.  ...  The terabyte track has created a total of 149 adhoc search topics over the course of TRECs 2004-2006.  ... 
doi:10.1145/1277741.1277908 dblp:conf/sigir/Soboroff07 fatcat:nxpsjcdiwjhb5la4b6twdfp5tq
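
The pooling-plus-sampling procedure the abstract above describes can be illustrated with a short sketch: judge a shallow pool first, then draw a uniform random sample from the deeper, still unjudged ranks. This is only a minimal illustration under assumed depths and sample sizes, not the actual TREC 2006 Terabyte sampling code.

```python
import random

def shallow_pool(runs, depth=10):
    """Union of the top-`depth` documents from every run for one topic (depth is illustrative)."""
    pool = set()
    for ranking in runs:            # each run is a ranked list of doc ids for this topic
        pool.update(ranking[:depth])
    return pool

def sample_beyond_pool(runs, judged, max_depth=100, sample_size=200, seed=42):
    """Uniform random sample of pooled-but-unjudged documents below the shallow pool."""
    candidates = set()
    for ranking in runs:
        candidates.update(ranking[:max_depth])
    candidates -= judged            # skip documents already judged in the shallow pool
    rng = random.Random(seed)
    return rng.sample(sorted(candidates), min(sample_size, len(candidates)))
```

Metrics computed from such sampled judgments then have to be reweighted by the sampling rate, which is the estimation problem taken up in the Yilmaz et al. entry further down this page.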

The TREC terabyte retrieval track

Charles Clarke, Nick Craswell, Ian Soboroff
2005 SIGIR Forum  
TREC 2005 is the second year for the track. The track was introduced as part of TREC 2004, with a single adhoc retrieval task. That year, 17 groups submitted 70 runs in total.  ...  The Terabyte Track explores how retrieval and evaluation techniques can scale to terabyte-sized collections, examining both efficiency and effectiveness issues.  ...  The members of the Search Engine Group at RMIT University helped in the creation of the named page topics for this year's Terabyte Track.  ... 
doi:10.1145/1067268.1067274 fatcat:mctp2mzjvjewlgga3k7svm5lvq

The Effect of Content-Equivalent Near-Duplicates on the Evaluation of Search Engines [chapter]

Maik Fröbe, Jan Philipp Bittner, Martin Potthast, Matthias Hagen
2020 Lecture Notes in Computer Science  
In this paper, we reproduce the aforementioned study and extend it to incorporate all TREC Terabyte, Web, and Core tracks. ... The worst-case penalty of having filtered duplicates in any of these tracks was a loss of between 8 and 53 ranks. ... But we observe the maximum drop of ranks (53, max I) in the Terabyte Track 2006. ...
doi:10.1007/978-3-030-45442-5_2 fatcat:rkhgv7v7ajct5c5sqoeesoixbe

Using parsimonious language models on web data

Rianne Kaptein, Rongmei Li, Djoerd Hiemstra, Jaap Kamps
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
We have conducted experiments on four TREC topic sets, and found that the parsimonious language model results in improvement of retrieval effectiveness over the standard language model for all data sets ... In this paper we explore the use of parsimonious language models for web retrieval. ... We test our models on four TREC datasets: Web track TREC-8 (WT2g collection of 250K documents) and Terabyte tracks 2004, 2005, and 2006 (.GOV2 collection of 25M ...
doi:10.1145/1390334.1390491 dblp:conf/sigir/KapteinLHK08 fatcat:uthkx3c24vfvndilfbolwe5hi4
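
The parsimonious language model referred to in this entry is typically estimated with an EM procedure that discounts terms already well explained by the background collection model. Below is a minimal sketch of that re-estimation with an assumed mixing weight `lam` and pruning threshold; it is not the authors' implementation.

```python
def parsimonious_lm(doc_tf, collection_prob, lam=0.1, iters=50, threshold=1e-4):
    """EM re-estimation of a document language model that discounts terms the
    collection model already explains well (a sketch, not the paper's code).

    doc_tf: {term: raw term frequency in the document}
    collection_prob: {term: P(term | collection)}
    lam: weight of the document model in the document/collection mixture.
    """
    total = sum(doc_tf.values())
    p_doc = {t: tf / total for t, tf in doc_tf.items()}      # start from the MLE
    for _ in range(iters):
        expected = {}
        for t, tf in doc_tf.items():
            num = lam * p_doc.get(t, 0.0)
            den = num + (1 - lam) * collection_prob.get(t, 1e-9)
            expected[t] = tf * (num / den)                    # E-step
        norm = sum(expected.values())
        # M-step, dropping near-zero terms (threshold value is illustrative)
        p_doc = {t: e / norm for t, e in expected.items() if e / norm > threshold}
    return p_doc
```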

Supervised query modeling using wikipedia

Edgar Meij, Maarten de Rijke
2010 Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10  
To this end, we apply supervised machine learning to automatically link queries to Wikipedia articles and sample terms from the linked articles to re-estimate the query model.  ...  We use Wikipedia articles to semantically inform the generation of query models.  ...  For TREC Terabyte 2004-2006, we have 150 topics which are split equally. For TREC Web 2009 we have 50 topics and use five-fold cross validation.  ... 
doi:10.1145/1835449.1835660 dblp:conf/sigir/MeijR10 fatcat:ya4mi4qaezfurk7hwnsyovpc2m
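
A rough sketch of the query-model re-estimation step this entry describes: once a query has been linked to a Wikipedia article (the supervised linking step is assumed to be given), sample the article's most prominent terms and interpolate them with the original query. Function names, the term-sampling heuristic, and all parameters are illustrative assumptions.

```python
from collections import Counter

def expand_query_model(query_terms, linked_article_text, top_n=10, mix=0.5):
    """Interpolate the original query model with the top-`top_n` terms of the
    Wikipedia article the query was linked to (assumes a non-empty query)."""
    q_counts = Counter(query_terms)
    q_model = {t: c / len(query_terms) for t, c in q_counts.items()}
    a_counts = Counter(linked_article_text.lower().split())
    top = dict(a_counts.most_common(top_n))          # crude term sampling by frequency
    a_total = sum(top.values()) or 1
    a_model = {t: c / a_total for t, c in top.items()}
    vocab = set(q_model) | set(a_model)
    return {t: mix * q_model.get(t, 0.0) + (1 - mix) * a_model.get(t, 0.0)
            for t in vocab}
```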

Precision-at-ten considered redundant

William Webber, Alistair Moffat, Justin Zobel, Tetsuya Sakai
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
In this paper, we demonstrate that complex metrics are as good as or better than simple metrics at predicting the performance of the simple metrics on other topics. ... We use the submitted runs and relevance judgments from the AdHoc Track of TREC 8 and the Terabyte Track of TREC 2004. ... The top 75% of TREC 2004 Terabyte Track systems tell a similar story for P@10. ...
doi:10.1145/1390334.1390456 dblp:conf/sigir/WebberMZS08 fatcat:7yske73zzbcl3nhiflnajgzrni
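
For reference, the "simple" and "complex" metrics contrasted in this entry reduce to standard definitions such as precision at rank 10 and average precision; a minimal single-topic implementation of both is sketched below (these are the textbook definitions, not code from the paper).

```python
def precision_at_k(ranking, relevant, k=10):
    """P@k: fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in ranking[:k] if d in relevant) / k

def average_precision(ranking, relevant):
    """AP: mean of the precision values at the ranks where relevant documents appear."""
    hits, total = 0, 0.0
    for i, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0
```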

Incorporating term dependency in the dfr framework

Jie Peng, Craig Macdonald, Ben He, Vassilis Plachouras, Iadh Ounis
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
We evaluate our term dependency model on the two adhoc retrieval tasks using the TREC .GOV2 Terabyte collection.  ...  Furthermore, we examine the effect of varying the term dependency window size on the retrieval performance of the proposed model.  ...  We evaluate the proposed model in the context of the TREC 2005 and TREC 2006 Terabyte track adhoc tasks (Section 3).  ... 
doi:10.1145/1277741.1277937 dblp:conf/sigir/PengMHPO07 fatcat:2bscynxbhbcs7hetceswgxbmyy
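
The term-dependency models evaluated in this entry build on counts of how often pairs of query terms co-occur within a small text window. The sketch below computes such pair counts; it only illustrates the underlying statistic, not Terrier's DFR implementation, and the window size is an assumption.

```python
from itertools import combinations

def pair_window_counts(doc_tokens, query_terms, window=5):
    """For each unordered pair of query terms, count how often both occur
    within `window` token positions of each other in the document."""
    positions = {t: [i for i, tok in enumerate(doc_tokens) if tok == t]
                 for t in set(query_terms)}
    counts = {}
    for t1, t2 in combinations(sorted(set(query_terms)), 2):
        c = 0
        for i in positions[t1]:
            c += sum(1 for j in positions[t2] if abs(i - j) < window)
        counts[(t1, t2)] = c
    return counts
```

Such pair counts can then be scored much like ordinary term frequencies inside a DFR model, which is the general idea the paper develops.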

Searching for expertise using the terrier platform

Craig Macdonald, Iadh Ounis
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
University of Glasgow at TREC 2005: Experiments in Terabyte and Enterprise tracks with Terrier. In TREC-2005 Proc. ... In the TREC 2005 Enterprise track, we developed an expert search system based on the Terrier IR platform [1]. ...
doi:10.1145/1148170.1148345 dblp:conf/sigir/MacdonaldO06a fatcat:x7usodaflnguhirijhwdge3zda

A document-centric approach to static index pruning in text retrieval systems

Stefan Büttcher, Charles L. A. Clarke
2006 Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06  
The decision is made based on the term's contribution to the document's Kullback-Leibler divergence from the text collection's global language model.  ...  Our technique can be used to decrease the size of the index by over 90%, at only a minor decrease in retrieval effectiveness.  ...  TREC Terabyte We compared the performance of our pruning method to other retrieval systems that participated in the efficiency task of the TREC 2005 Terabyte track (as reported by Clarke et al.  ... 
doi:10.1145/1183614.1183644 dblp:conf/cikm/ButtcherC06 fatcat:3hq2diemnbennie4kj2t4vb6mq
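
The pruning criterion summarized above (keep the terms that contribute most to the document's KL divergence from the collection's global language model) can be sketched as follows. The cutoff `keep` and the smoothing constant are illustrative assumptions, and a real system prunes postings lists rather than Python dicts.

```python
import math

def prune_document(doc_tf, collection_prob, keep=20):
    """Keep only the `keep` terms with the largest contribution to
    KL(document || collection); the rest are dropped from the index."""
    total = sum(doc_tf.values())
    scores = {}
    for t, tf in doc_tf.items():
        p_d = tf / total
        p_c = collection_prob.get(t, 1e-9)       # smoothed collection probability
        scores[t] = p_d * math.log(p_d / p_c)    # this term's KL contribution
    kept = sorted(scores, key=scores.get, reverse=True)[:keep]
    return {t: doc_tf[t] for t in kept}
```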

Bias and the limits of pooling

Chris Buckley, Darrin Dimmick, Ian Soboroff, Ellen Voorhees
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
The collections built in the TREC terabyte track do not have an equivalent "smoking gun" run, but doubts regarding the viability of traditional pooling for document sets in the terabyte range seem ... The titlestat rel values for the TREC 2004 and TREC 2005 terabyte collections are 0.889 and 0.898. ...
doi:10.1145/1148170.1148284 dblp:conf/sigir/BuckleyDSV06 fatcat:tr4kznfaxzfitf7btagfocjpzi

Improvements that don't add up

Timothy G. Armstrong, Alistair Moffat, William Webber, Justin Zobel
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
In this paper, we analyze results achieved on the TREC Ad-Hoc, Web, Terabyte, and Robust collections as reported in SIGIR (1998-2008) and CIKM (2004-2008). ... And in only a handful of experiments is the score of the best TREC automatic run exceeded. ... This work was supported by the Australian Research Council. The inclusion and format of Table 1 was pertinently suggested by an anonymous referee. ...
doi:10.1145/1645953.1646031 dblp:conf/cikm/ArmstrongMWZ09 fatcat:zbfe7wnwjbhgnac4s4t55s5sgu

What makes a query difficult?

David Carmel, Elad Yom-Tov, Adam Darlow, Dan Pelleg
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
The three components of a topic are the textual expression describing the information need (the query or queries), the set of documents relevant to the topic (the Qrels), and the entire collection of documents  ...  In the absence of knowledge about one of the model components, the model is still useful by approximating the missing component based on the other components.  ...  The authors thank Shai Fine for his invaluable suggestions regarding the model and the JSD distance.  ... 
doi:10.1145/1148170.1148238 dblp:conf/sigir/CarmelYDP06 fatcat:k5y2rmgxbzarfb2pqdj2dxjaoq
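
The snippet above mentions a JSD distance between the model components (query, Qrels, collection), each of which can be represented as a term distribution. A minimal Jensen-Shannon divergence over such distributions, assuming they are given as term-to-probability dicts, looks like this (a generic implementation, not the paper's code):

```python
import math

def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two term distributions given as dicts."""
    vocab = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}

    def kl(a, b):
        # KL(a || b); only terms with a(t) > 0 contribute
        return sum(a.get(t, 0.0) * math.log(a.get(t, 0.0) / b[t])
                   for t in vocab if a.get(t, 0.0) > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```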

Review of Methods in TREC from 1992 to 2014

Kalpana Khandale, Maheshkumar B., C. Namrata
2016 International Journal of Computer Applications  
This paper presents an overview of the Text Retrieval Conferences (TRECs) from 1992 (TREC-1) to 2014 (TREC-23). ... A brief comparative report about the methods, number of tracks, and outcomes is presented in this paper so that researchers working or wanting to work in this domain get an up-to-date view regarding ... Novelty Track, QA Track, Video Track, Web Track, CLIR Track, Genome Track, HARD Track, Robust Track, Terabyte Track, Enterprise Track, Spam Track, Legal Track, Blog Track [12][13][14][15][16][17], respectively ...
doi:10.5120/ijca2016907938 fatcat:ibtvcqhmbjcv7jjs4vgz7y2ium

A simple and efficient sampling method for estimating AP and NDCG

Emine Yilmaz, Evangelos Kanoulas, Javed A. Aslam
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
We validate the proposed methods using TREC data and demonstrate that these new methods can be used to incorporate nonrandom samples, as were available in TREC Terabyte track '06.  ...  While the first method proposed by Aslam et al. [1] is quite accurate and efficient, it is overly complex, making it difficult to be used by the community, and while the second method proposed by Yilmaz  ...  First, let's briefly consider the sampling strategy used in TREC Terabyte 2006.  ... 
doi:10.1145/1390334.1390437 dblp:conf/sigir/YilmazKA08 fatcat:nmocdzridrektfvfnyacpnahey
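
The sampling estimators discussed in this entry weight each sampled judgment by the inverse of its inclusion probability. The sketch below applies that general idea to precision at rank k for a single topic; it is a generic inverse-probability estimator, not the paper's exact AP or NDCG estimator.

```python
def estimated_precision_at_k(ranking, sample, k=10):
    """Estimate P@k from a random sample of judged documents.

    `sample` maps doc id -> (relevance, inclusion_probability); documents not
    drawn into the sample simply contribute nothing here."""
    total = 0.0
    for doc in ranking[:k]:
        if doc in sample:
            rel, prob = sample[doc]
            total += rel / prob        # inverse-probability (Horvitz-Thompson style) weight
    return total / k
```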
Showing results 1-15 out of 240 results