RMITB at TREC COVID 2020 [article]

Rodger Benham, Alistair Moffat, J. Shane Culpepper
2020 arXiv   pre-print
In our analysis, we focus primarily on the effects of having our second priority run omitted from the judgment pool.  ...  These user query variations have been exploited in past TREC CORE tracks to contribute diverse, highly-effective runs in offline evaluation campaigns with the goal of producing reusable test collections  ...  Code and query variations to reproduce experiments are available at https://github.com/rmit-ir/rmitb-trec-covid.  ...
arXiv:2011.04830v1 fatcat:nyy5uc6mubhd3foidnmfatquvy

Variations in relevance assessments and the measurement of retrieval effectiveness

Stephen P. Harter
1996 Journal of the American Society for Information Science  
The purpose of this article is to bring attention to the problem of variations in relevance assessments and the effects that these may have on measures of retrieval effectiveness.  ...  Through an analytical review of the literature, I show that despite known wide variations in relevance assessments in experimental test collections, their effects on the measurement of retrieval performance  ...  Relevance judgments may vary according to how a given citation affects a user's conceptualization of the information problem: How it may cause cognitive change by shedding new light on the problem, by  ... 
doi:10.1002/(sici)1097-4571(199601)47:1<37::aid-asi4>3.3.co;2-i fatcat:cckshv2tw5arbp2wqdzdo7kyea

Variations in relevance assessments and the measurement of retrieval effectiveness

Stephen P. Harter
1996 Journal of the American Society for Information Science  
The purpose of this article is to bring attention to the problem of variations in relevance assessments and the effects that these may have on measures of retrieval effectiveness.  ...  Through an analytical review of the literature, I show that despite known wide variations in relevance assessments in experimental test collections, their effects on the measurement of retrieval performance  ...  Relevance judgments may vary according to how a given citation affects a user's conceptualization of the information problem: How it may cause cognitive change by shedding new light on the problem, by  ... 
doi:10.1002/(sici)1097-4571(199601)47:1<37::aid-asi4>3.0.co;2-3 fatcat:cmjlyyyg65cvvlv6kpnaxy6lrq

Evaluating search systems using result page context

Peter Bailey, Nick Craswell, Ryen W. White, Liwei Chen, Ashwin Satyanarayana, S. M.M. Tahaghoghi
2010 Proceeding of the third symposium on Information interaction in context - IIiX '10  
We also study possible issues with applying the method, including brand presentation effects, inter-judge agreement, and comparisons with document-based relevance judgments.  ...  Contrary to Cranfield-style evaluation methods, our approach recognizes that a user's initial search interaction is with the result page produced by a search system, not the landing pages linked from it  ...  Of concern in doing so are potential variations arising from presentation effects.  ...
doi:10.1145/1840784.1840801 dblp:conf/iiix/BaileyCWCST10 fatcat:6zlxguk4vffvvi2rvd3j53wise

The TREC-2001 Cross-Language Information Retrieval Track: Searching Arabic Using English, French or Arabic Queries

Fredric C. Gey, Douglas W. Oard
2001 Text Retrieval Conference  
This raises some concern that the relevance judgment pools may be less complete than has historically been the case.  ...  French and Arabic translations of the queries were also available.  ...  As is common in information retrieval evaluations, substantial variation was observed in retrieval effectiveness on a topic-by-topic basis.  ... 
dblp:conf/trec/GeyO01 fatcat:cwkg2mal3zgrpewhdtyo333tea

Relevance judgements for assessing recall

Peter Wallis, James A. Thom
1996 Information Processing & Management  
The problem is demonstrated by comparing two information retrieval methods over several queries, and showing how a new method of forming relevance judgments that is suitable for assessing recall gives  ...  To compare performance of different systems, standard collections of documents, queries, and relevance judgments have been used.  ...  Keyword retrieval of documents written in a restricted language, by queries written in the same restricted language could significantly reduce the number of misses caused by variation in language use.  ... 
doi:10.1016/0306-4573(95)00061-5 fatcat:snrjbuezfbcj7ctb33hl54ozvi

Selecting a Subset of Queries for Acquisition of Further Relevance Judgements [chapter]

Mehdi Hosseini, Ingemar J. Cox, Natasa Milic-Frayling, Vishwa Vinay, Trevor Sweeting
2011 Lecture Notes in Computer Science  
From the documents retrieved by the new systems we create a pool of unjudged documents.  ...  Rather than uniformly distributing the budget across all queries, we first select a subset of queries that are effective in evaluating systems and then uniformly allocate the budget only across these queries  ...  However, since documents are provided by systems that are being compared, the resulting document pool is expected to be effective in assessing their relative performance [14].  ...
doi:10.1007/978-3-642-23318-0_12 fatcat:sowvic7omjhsfgngwt2fxyhyry

Evaluation effort, reliability and reusability in XML retrieval

Sukomal Pal, Mandar Mitra, Jaap Kamps
2010 Journal of the American Society for Information Science and Technology  
However, when judging only a random sample of a pool, it is better to completely judge fewer topics than to partially judge many topics. This result confirms the effectiveness of pooling methods.  ...  Finally, they observe that for a fixed amount of effort, judging shallow pools for many queries is better than judging deep pools for a smaller set of queries.  ...  Thus, unlike in random pool sampling, the query contributes to the precision scores of all systems uniformly; the reduction in τ is caused by the variation of system performance across topics.  ... 
doi:10.1002/asi.21403 fatcat:zusa7ro7brgwfpnnkxpvu3degq

Rank-biased precision for measurement of retrieval effectiveness

Alistair Moffat, Justin Zobel
2008 ACM Transactions on Information Systems  
Rank-biased precision for measurement of retrieval effectiveness.  ...  These are typically intended to provide a quantitative single-value summary of a document ranking relative to a query. However, many of these measures have failings.  ...  In recent work, Webber et al. [2008] explored this point, and described a standardization approach that removes the bias caused by query variation.  ... 
doi:10.1145/1416950.1416952 fatcat:qpe7245dgfelvn5hwnjrjyuiuq

Variations in relevance judgments and the measurement of retrieval effectiveness

Ellen M. Voorhees
2000 Information Processing & Management  
Test collections have traditionally been used by information retrieval researchers to improve their retrieval strategies.  ...  Very high correlations were found among the rankings of systems produced using different relevance judgment sets.  ...  The analysis of the effect of averaging, including creating Figure 5, was performed by Paul Over of NIST.  ...
doi:10.1016/s0306-4573(00)00010-8 fatcat:elp7u7k3fjhqpiynhjce7heneq

Variations in relevance judgments and the measurement of retrieval effectiveness

Ellen M. Voorhees
1998 Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98  
Test collections have traditionally been used by information retrieval researchers to improve their retrieval strategies.  ...  Very high correlations were found among the rankings of systems produced using different relevance judgment sets.  ...  The analysis of the effect of averaging, including creating Figure 5, was performed by Paul Over of NIST.  ...
doi:10.1145/290941.291017 dblp:conf/sigir/Voorhees98 fatcat:fbbz34opevfgpi3dyafo3ld6xi

Overview of the TREC 2006 Enterprise Track

Ian Soboroff, Arjen P. de Vries, Nick Craswell
2006 Text Retrieval Conference  
Reducing pool size As indicated above, the inclusion of support documents for experts caused the pools to be very large.  ...  Since the expert judgments were presumably informed by the supporting documents, we could not just apply the original expert judgment in the reduced pools.  ... 
dblp:conf/trec/SoboroffVC06 fatcat:5idgz226fjd5dntnbnjkbh5k34

Cross-Language Retrieval at the University of Twente and TNO [chapter]

Dennis Reidsma, Djoerd Hiemstra, Franciska de Jong, Wessel Kraaij
2003 Lecture Notes in Computer Science  
The goal of the experiment was to examine possible influences on the assessments caused by the use of highlighting in the assessment program. The Twente/TNO group used to participate in earlier CLEF-events  ...  Twenty-One was an information retrieval project funded by the TAP programme of the EU. The project was completed in June 1999.  ...  The expected effect of submitting a run for which the queries were manually created from the topics was an increase in the size and quality of the pool of documents to be assessed.  ...
doi:10.1007/978-3-540-45237-9_16 fatcat:clowb6vrtfhqzllefjppz3jacq

Estimating Pool-depth on Per Query Basis

Sukomal Pal, Mandar Mitra, Samaresh Maiti
2010 NTCIR Conference on Evaluation of Information Access Technologies  
Instead of using an apriori-fixed depth, variable pool-depth based pooling is adopted. The pool for each topic is incrementally built and judged interactively.  ...  When no new relevant document is found for a reasonably long run of pool-depths, pooling can be stopped for the topic.  ...  Again, for the queries where the rate of finding new reldocs is quite high, better estimates of recall can be ensured by going deeper in the pool (k > 100).  ... 
dblp:conf/ntcir/PalMM10 fatcat:5xcjycnzmzgdnpxrgwy2zqmfle

Bias and the limits of pooling for large collections

Chris Buckley, Darrin Dimmick, Ian Soboroff, Ellen Voorhees
2007 Information retrieval (Boston)  
This paper shows that the judgment sets produced by traditional pooling when the pools are too small relative to the total document set size can be biased in that they favor relevant documents that contain  ...  The idea behind pooling is to find enough relevant documents such that when unjudged documents are assumed to be nonrelevant the resulting judgment set is sufficiently complete and unbiased.  ...  Given that the change in the frequency of topic title words occurring in relevant documents did not happen by chance, what was the cause?  ... 
doi:10.1007/s10791-007-9032-x fatcat:ocsix63ddzan7bgxoik3xj5dti