Increasing evaluation sensitivity to diversity

Peter B. Golbus, Javed A. Aslam, Charles L. A. Clarke
2013 Information retrieval (Boston)  
Many queries have multiple interpretations; they are ambiguous or underspecified. This is especially true in the context of Web search. To account for this, much recent research has focused on creating systems that produce diverse ranked lists. In order to validate these systems, several new evaluation measures have been created to quantify diversity. Ideally, diversity evaluation measures would distinguish between systems by the amount of diversity in the ranked lists they produce.
Unfortunately, diversity is also a function of the collection over which the system is run and of the system's performance at ad hoc retrieval. A ranked list built from a collection that does not cover multiple subtopics cannot be diversified; neither can a ranked list that contains no relevant documents. To ensure that we are assessing systems by their diversity, we develop (1) a family of evaluation measures that take into account the diversity of the collection and (2) a meta-evaluation measure that explicitly controls for performance. We demonstrate experimentally that our new measures can achieve substantial improvements in sensitivity to diversity without reducing discriminative power.
doi:10.1007/s10791-012-9218-8 fatcat:5a3p7padabfmdp55ua2cpmz5sa