Scaling IR-system evaluation using term relevance sets

Einat Amitay, David Carmel, Ronny Lempel, Aya Soffer
Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '04), 2004
This paper describes an evaluation method based on Term Relevance Sets (Trels) that measures an IR system's quality by examining the content of the retrieved results rather than by looking for pre-specified relevant pages. A Trels consists of a list of terms believed to be relevant to a particular query as well as a list of irrelevant terms. The proposed method does not involve any document relevance judgments, and as such is not adversely affected by changes to the underlying collection. Hence, it can better scale to very large, dynamic collections such as the Web. Moreover, the method can evaluate a system's effectiveness on an updatable "live" collection, or on collections derived from different data sources. Our experiments show that the proposed method correlates highly with official TREC measures.
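The abstract describes scoring retrieved results by the terms they contain rather than by document-level relevance judgments. A minimal sketch of that idea is below; the scoring function and names are illustrative assumptions, not the paper's actual metric, which the abstract does not specify.

```python
# Hypothetical sketch of Trels-style evaluation: score each retrieved
# document by the relevant and irrelevant terms it contains, then
# average over the ranking. (The paper's exact formula is not given
# in this abstract; this counting scheme is an assumption.)

def trels_score(doc_text, relevant_terms, irrelevant_terms):
    """Count relevant-term hits minus irrelevant-term hits in a document."""
    words = set(doc_text.lower().split())
    rel_hits = sum(1 for t in relevant_terms if t in words)
    irr_hits = sum(1 for t in irrelevant_terms if t in words)
    return rel_hits - irr_hits

def evaluate_ranking(docs, relevant_terms, irrelevant_terms):
    """Mean Trels score over a list of retrieved documents."""
    if not docs:
        return 0.0
    scores = [trels_score(d, relevant_terms, irrelevant_terms)
              for d in docs]
    return sum(scores) / len(scores)

# Toy example for the query "python": the first result is on topic,
# the second matches an irrelevant sense of the query term.
docs = [
    "python is a programming language with dynamic typing",
    "pythons are large snakes found in africa and asia",
]
rel = ["programming", "language", "typing"]
irr = ["snakes", "reptile"]
print(evaluate_ranking(docs, rel, irr))  # (3 + -1) / 2 = 1.0
```

Because the judgment lives in the term lists rather than in a fixed set of judged documents, the same Trels can score rankings drawn from a changing or "live" collection, which is the scalability property the abstract emphasizes.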
doi:10.1145/1008992.1008997 dblp:conf/sigir/AmitayCLS04