A nugget-based test collection construction paradigm

Shahzad Rajput, Virgil Pavlu, Peter B. Golbus, Javed A. Aslam
2011 Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11  
The problem of building test collections is central to the development of information retrieval systems such as search engines. Starting with a few relevant "nuggets" of information manually extracted from existing TREC corpora, we implement and test a methodology that finds and correctly assesses the vast majority of relevant documents found by TREC assessors, as well as up to four times more additional relevant documents. Our methodology produces highly accurate test collections that hold the promise of addressing the issues of scalability, reusability, and applicability.
doi:10.1145/2063576.2063861 dblp:conf/cikm/RajputPGA11 fatcat:cj7jsyqi7jfyfom6wsxnvggqb4