Evaluating the Potential of Explicit Phrases for Retrieval Quality
[chapter]
2010
Lecture Notes in Computer Science
This paper evaluates the potential impact of explicit phrases on retrieval quality through a case study with the TREC Terabyte benchmark. ...
It compares the performance of user- and system-identified phrases with a standard score and a proximity-aware score, and shows that an optimal choice of phrases, including term permutations, can significantly ...
Web, and the 150 topics from the TREC Terabyte AdHoc tracks 2004-2006 (topics 701-850). ...
doi:10.1007/978-3-642-12275-0_62
fatcat:ekd7wnkm7jbppiqy4lbubribri
A comparison of pooled and sampled relevance judgments
2007
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07
This paper describes the approach taken in the TREC 2006 Terabyte Track: an initial shallow pool was judged to gather relevance information, which was then used to draw a random sample of further documents ...
The sample judgments rank systems somewhat differently than the pool. Some analysis and plans for further research are discussed. ...
The terabyte track has created a total of 149 adhoc search topics over the course of TRECs 2004-2006. ...
doi:10.1145/1277741.1277908
dblp:conf/sigir/Soboroff07
fatcat:nxpsjcdiwjhb5la4b6twdfp5tq
The TREC terabyte retrieval track
2005
SIGIR Forum
TREC 2005 is the second year for the track. The track was introduced as part of TREC 2004, with a single adhoc retrieval task. That year, 17 groups submitted 70 runs in total. ...
The Terabyte Track explores how retrieval and evaluation techniques can scale to terabyte-sized collections, examining both efficiency and effectiveness issues. ...
The members of the Search Engine Group at RMIT University helped in the creation of the named page topics for this year's Terabyte Track. ...
doi:10.1145/1067268.1067274
fatcat:mctp2mzjvjewlgga3k7svm5lvq
The Effect of Content-Equivalent Near-Duplicates on the Evaluation of Search Engines
[chapter]
2020
Lecture Notes in Computer Science
In this paper, we reproduce the aforementioned study and extend it to incorporate all TREC Terabyte, Web, and Core tracks. ...
The worst-case penalties of having filtered duplicates in any of these tracks were losses of between 8 and 53 ranks. ...
But we observe the maximum drop of ranks (53, max I ) in the Terabyte Track 2006. ...
doi:10.1007/978-3-030-45442-5_2
fatcat:rkhgv7v7ajct5c5sqoeesoixbe
Using parsimonious language models on web data
2008
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08
We have conducted experiments on four TREC topic sets, and found that the parsimonious language model results in improvement of retrieval effectiveness over the standard language model for all data-sets ...
In this paper we explore the use of parsimonious language models for web retrieval. ...
Experiments, 3.1 Experimental Set-up: We test our models on four TREC datasets, Web track TREC-8 (WT2g collection of 250K documents) and Terabyte tracks 2004, 2005 and 2006 (.GOV2 collection of 25M ...
doi:10.1145/1390334.1390491
dblp:conf/sigir/KapteinLHK08
fatcat:uthkx3c24vfvndilfbolwe5hi4
Supervised query modeling using wikipedia
2010
Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10
To this end, we apply supervised machine learning to automatically link queries to Wikipedia articles and sample terms from the linked articles to re-estimate the query model. ...
We use Wikipedia articles to semantically inform the generation of query models. ...
For TREC Terabyte 2004-2006, we have 150 topics which are split equally. For TREC Web 2009 we have 50 topics and use five-fold cross validation. ...
doi:10.1145/1835449.1835660
dblp:conf/sigir/MeijR10
fatcat:ya4mi4qaezfurk7hwnsyovpc2m
Precision-at-ten considered redundant
2008
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08
In this paper, we demonstrate that complex metrics are as good as or better than simple metrics at predicting the performance of the simple metrics on other topics. ...
Experimental Data and Method: We use the submitted runs and relevance judgments from the AdHoc Track of TREC 8 and the Terabyte Track of TREC 2004. ...
... top 75% of TREC 2004 Terabyte Track systems. ... tell a similar story for P@10. ...
doi:10.1145/1390334.1390456
dblp:conf/sigir/WebberMZS08
fatcat:7yske73zzbcl3nhiflnajgzrni
Incorporating term dependency in the dfr framework
2007
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07
We evaluate our term dependency model on the two adhoc retrieval tasks using the TREC .GOV2 Terabyte collection. ...
Furthermore, we examine the effect of varying the term dependency window size on the retrieval performance of the proposed model. ...
We evaluate the proposed model in the context of the TREC 2005 and TREC 2006 Terabyte track adhoc tasks (Section 3). ...
doi:10.1145/1277741.1277937
dblp:conf/sigir/PengMHPO07
fatcat:2bscynxbhbcs7hetceswgxbmyy
Searching for expertise using the terrier platform
2006
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06
University of Glasgow at TREC 2005: Experiments in Terabyte and Enterprise tracks with Terrier. In TREC-2005 Proc. [2] M. Maybury, R. ...
In the TREC 2005 Enterprise track, we developed an expert search system based on the Terrier IR platform [1]. ...
doi:10.1145/1148170.1148345
dblp:conf/sigir/MacdonaldO06a
fatcat:x7usodaflnguhirijhwdge3zda
A document-centric approach to static index pruning in text retrieval systems
2006
Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06
The decision is made based on the term's contribution to the document's Kullback-Leibler divergence from the text collection's global language model. ...
Our technique can be used to decrease the size of the index by over 90%, at only a minor decrease in retrieval effectiveness. ...
TREC Terabyte: We compared the performance of our pruning method to other retrieval systems that participated in the efficiency task of the TREC 2005 Terabyte track (as reported by Clarke et al. ...
doi:10.1145/1183614.1183644
dblp:conf/cikm/ButtcherC06
fatcat:3hq2diemnbennie4kj2t4vb6mq
Bias and the limits of pooling
2006
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06
The collections built in the TREC terabyte track do not have an equivalent "smoking gun" run, but doubts regarding the viability of traditional pooling for document sets in the terabyte range seem ...
The titlestat_rel values for the TREC 2004 and TREC 2005 terabyte collections are 0.889 and 0.898. ...
doi:10.1145/1148170.1148284
dblp:conf/sigir/BuckleyDSV06
fatcat:tr4kznfaxzfitf7btagfocjpzi
Improvements that don't add up
2009
Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09
In this paper, we analyze results achieved on the TREC Ad-Hoc, Web, Terabyte, and Robust collections as reported in SIGIR (1998-2008) and CIKM (2004-2008). ...
And in only a handful of experiments is the score of the best TREC automatic run exceeded. ...
This work was supported by the Australian Research Council. The inclusion and format of Table 1 was pertinently suggested by an anonymous referee. ...
doi:10.1145/1645953.1646031
dblp:conf/cikm/ArmstrongMWZ09
fatcat:zbfe7wnwjbhgnac4s4t55s5sgu
What makes a query difficult?
2006
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06
The three components of a topic are the textual expression describing the information need (the query or queries), the set of documents relevant to the topic (the Qrels), and the entire collection of documents ...
In the absence of knowledge about one of the model components, the model is still useful by approximating the missing component based on the other components. ...
The authors thank Shai Fine for his invaluable suggestions regarding the model and the JSD distance. ...
doi:10.1145/1148170.1148238
dblp:conf/sigir/CarmelYDP06
fatcat:k5y2rmgxbzarfb2pqdj2dxjaoq
Review of Methods in TREC from 1992 to 2014
2016
International Journal of Computer Applications
This paper gives an overview of the Text REtrieval Conferences (TRECs) from 1992 (TREC-1) to 2014 (TREC-23). ...
A brief comparative report on the methods, number of tracks, and outcomes is presented in this paper so that researchers working, or wanting to work, in this domain get an up-to-date view regarding
Novelty Track, QA Track, Video Track, Web Track, CLIR Track, Genome Track, HARD Track, Robust Track, Terabyte Track, Enterprise Track, Spam Track, Legal Track, Blog Track [12][13][14][15][16][17], respectively ...
doi:10.5120/ijca2016907938
fatcat:ibtvcqhmbjcv7jjs4vgz7y2ium
A simple and efficient sampling method for estimating AP and NDCG
2008
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08
We validate the proposed methods using TREC data and demonstrate that these new methods can be used to incorporate nonrandom samples, as were available in TREC Terabyte track '06. ...
While the first method proposed by Aslam et al. [1] is quite accurate and efficient, it is overly complex, making it difficult for the community to use, and while the second method proposed by Yilmaz ...
First, let's briefly consider the sampling strategy used in TREC Terabyte 2006. ...
doi:10.1145/1390334.1390437
dblp:conf/sigir/YilmazKA08
fatcat:nmocdzridrektfvfnyacpnahey