Evaluation Methodologies in Information Retrieval
Maristella Agosti, Norbert Fuhr, Elaine Toms, Pertti Vakkari
(+1 others)
Report from Dagstuhl Seminar
unpublished
This report documents the program and the outcome of Dagstuhl Seminar 13441 "Evaluation Methodologies in Information Retrieval", which brought together 42 participants from 11 countries. The seminar was motivated by the fact that today's information retrieval (IR) applications can hardly be evaluated based on the classic test collection paradigm; thus, there is a need for new evaluation approaches. The event started with five introductory talks on evaluation frameworks, user modeling for
... ion, evaluation criteria, measures, evaluation methodology, and new trends in IR evaluation. The seminar participants then formed working groups addressing specific aspects of IR evaluation, such as reliability and validity, task-based IR, learning as search outcome, searching for fun, IR and social media, graph search, domain-specific IR, interaction measures and models, and searcher-aware information access systems.
License Creative Commons BY 3.0 Unported license © Maristella Agosti, Norbert Fuhr, Elaine Toms, and Pertti Vakkari
Evaluation of information retrieval (IR) systems has a long tradition. However, the test-collection based evaluation paradigm is of limited value for assessing today's IR applications, since it fails to address major aspects of the IR process. Thus there is a need for new evaluation approaches, which was the focus of this seminar.
Before the event, each participant was asked to identify one to five crucial issues in IR evaluation methodology. Pertti Vakkari presented a summary of this homework, pointing out that there are five major themes deemed relevant by the participants: 1) Evaluation frameworks, 2) Whole session evaluation and evaluation over sessions, 3) Evaluation criteria: from relevance to utility, 4) User modeling, and 5) Methodology and metrics.
fatcat:b7o273i4sfbhdkxj4l4mtjndz4