
Repeatable and reliable search system evaluation using crowdsourcing

Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran Duc
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval - SIGIR '11
Using the first large-scale evaluation campaign that specifically targets the task of ad-hoc Web object retrieval over a number of deployed systems, we demonstrate that crowd-sourced evaluation campaigns  ...  To demonstrate, we investigate creating an evaluation campaign for the semantic search task of keyword-based ad-hoc object retrieval.  ...  over long periods of time, a necessary feature for running large-scale campaigns for novel information retrieval tasks on an annual basis.  ... 
doi:10.1145/2009916.2010039 dblp:conf/sigir/BlancoHHMPTT11 fatcat:3v4ewk3rmrek7acjm6tbgjtijq

Applying human computation mechanisms to information retrieval

Christopher G. Harris, Padmini Srinivasan
2012 Proceedings of the American Society for Information Science and Technology  
Keywords: Crowdsourcing, information retrieval, GWAP, games with a purpose, human computation.  ...  Despite this increased attention, much of this transformation has been limited to a few aspects of Information Retrieval (IR). In this paper, we examine these two mechanisms' applicability to IR.  ...  The Value-Added Effects of Human Mechanisms: Our initial criterion (Criterion 1), which examined if each step in our IR model could scale using the crowd and GWAP, was evaluated based on turning the entire  ... 
doi:10.1002/meet.14504901050 fatcat:swqwsrgcwrbffm45dexqlaywzu

Quality of Crowdsourced Relevance Judgments in Association with Logical Reasoning Ability

Sri Devi Ravana, Parnia Samimi, Prabha Rajagopal
2018 Malaysian Journal of Computer Science  
It is important to determine the attributes that could affect the effectiveness of crowdsourced judgments in information retrieval systems evaluation.  ...  The study also evaluates the effect of cognitive characteristics on the quality of relevance judgments compared to the gold standard dataset.  ...  Does logical reasoning ability affect the information retrieval systems' rankings in information retrieval evaluation?  ... 
doi:10.22452/mjcs.sp2018no1.6 fatcat:n7jkas5shrafnnluid523qzvxi

Repeatable and reliable semantic search evaluation

Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran
2013 Journal of Web Semantics  
In this paper, we present an evaluation framework for semantic search, analyze the framework with regard to repeatability and reliability, and report on our experiences in applying it in the Semantic Search  ...  These solutions exploit the explicit semantics captured in structured data such as RDF for enhancing document representation and retrieval, or for finding answers by directly searching over the data.  ...  Another goal of our work is to demonstrate the use of crowdsourcing for a large-scale evaluation campaign for a novel search task, which in our case is ad-hoc object retrieval over RDF.  ... 
doi:10.1016/j.websem.2013.05.005 fatcat:rbh52sc6iveonnwxzsfmktb7ny

Repeatable and Reliable Semantic Search Evaluation

Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran
2013 Social Science Research Network  
In this paper, we present an evaluation framework for semantic search, analyze the framework with regard to repeatability and reliability, and report on our experiences in applying it in the Semantic Search  ...  These solutions exploit the explicit semantics captured in structured data such as RDF for enhancing document representation and retrieval, or for finding answers by directly searching over the data.  ...  Another goal of our work is to demonstrate the use of crowdsourcing for a large-scale evaluation campaign for a novel search task, which in our case is ad-hoc object retrieval over RDF.  ... 
doi:10.2139/ssrn.3199069 fatcat:da2hqnm4cvbv7jonfw7kuogyia

Augmented Test Collections: A Step in the Right Direction [article]

Laura Hasler, Martin Halvey, Robert Villa
2015 arXiv   pre-print
In this position paper we argue that certain aspects of relevance assessment in the evaluation of IR systems are oversimplified and that human assessments represented by qrels should be augmented to take  ...  We propose enhancing test collections used in evaluation with information related to human assessors and their interpretation of the task.  ...  A next step would be to investigate whether we can gather a substantial amount of comparable assessments and corresponding additional information effectively on a larger scale, using, for example, crowdsourcing  ... 
arXiv:1501.06370v1 fatcat:he46ekn32zeppi7un5yz2gnovy

Creation of Reliable Relevance Judgments in Information Retrieval Systems Evaluation Experimentation through Crowdsourcing: A Review

Parnia Samimi, Sri Devi Ravana
2014 The Scientific World Journal  
Test collections are used to evaluate information retrieval systems in laboratory-based evaluation experiments.  ...  One of the crowdsourcing applications in IR is to judge the relevance of query-document pairs.  ... 
doi:10.1155/2014/135641 pmid:24977172 pmcid:PMC4055211 fatcat:qfbavfc45jfmzp4yisqyedx2o4

Assessing geographic relevance for mobile search: A computational model and its validation via crowdsourcing

Tumasch Reichenbacher, Stefano De Sabbata, Ross S. Purves, Sara I. Fabrikant
2016 Journal of the Association for Information Science and Technology  
The selection and retrieval of relevant information from the information universe on the web is becoming increasingly important in addressing information overload.  ...  To determine the effectiveness and validity of these methods, we evaluate them through a user study conducted on the Amazon Mechanical Turk crowdsourcing platform.  ...  Acknowledgments: The presented work is part of the project 'Geographic Relevance in Mobile Applications' funded by the Swiss National Science Foundation (Project 200021_119819 / 1).  ... 
doi:10.1002/asi.23625 fatcat:aps4go2jfzehfo43oqdvektouq

Crowdsourcing for book search evaluation

Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval - SIGIR '11
We assess the output in terms of label agreement with a gold standard data set and observe the effect of the crowdsourced relevance judgments on the resulting system rankings.  ...  This enables us to observe the effect of crowdsourcing on the entire IR evaluation process.  ...  We detail the design of a large-scale experiment in Section 3, aimed to study the effects of design decisions in crowdsourcing.  ... 
doi:10.1145/2009916.2009947 dblp:conf/sigir/KazaiKKM11 fatcat:5ekvpblhrfdehkmey76nkpjlbq

Low cost evaluation in information retrieval

Ben Carterette, Evangelos Kanoulas, Emine Yilmaz
2010 Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10
They require substantially more cost in human assessments for the same reliability in evaluation; if the additional cost goes over the assessing budget, errors in evaluation are inevitable.  ...  Most of her current work involves evaluation of retrieval systems, the effect of evaluation metrics on learning-to-rank problems, and modeling user behavior.  ... 
doi:10.1145/1835449.1835675 dblp:conf/sigir/CarteretteKY10 fatcat:wp4bkff3g5cbpprqupyy5stmeu

Crowdsourcing for relevance evaluation

Omar Alonso, Daniel E. Rose, Benjamin Stewart
2008 SIGIR Forum  
Relevance evaluation is an essential part of the development and maintenance of information retrieval systems.  ...  We describe a new approach to evaluation called TERC, based on the crowdsourcing paradigm, in which many online users, drawn from a large community, each performs a small evaluation task.  ...  Relevance evaluation for information retrieval is a notoriously difficult and expensive task.  ... 
doi:10.1145/1480506.1480508 fatcat:oukvqgjcpfhlve6ycdpsoitq2y

Overview of EIREX 2011: Crowdsourcing [article]

Julián Urbano, Diego Martín, Mónica Marrero, Jorge Morato
2012 arXiv   pre-print
For information on other editions of EIREX and related data, see the website at http://ir.kr.inf.uc3m.es/eirex/.  ...  This overview paper summarizes the results of the EIREX 2011 track, focusing on the creation of the test collection and the analysis to assess its reliability.  ...  For this second EIREX edition we chose the theme to be Crowdsourcing, as it is a topic of interest for Information Retrieval.  ... 
arXiv:1203.0518v1 fatcat:n3crj7e2uvbytejvw2wh4yqhdu

Identifying Useful and Important Information within Retrieved Documents

Piyush Arora, Gareth J.F. Jones
2017 Proceedings of the 2017 Conference on Human Information Interaction and Retrieval - CHIIR '17
We report three user studies using a crowdsourcing platform, where participants were first asked to read an information need and the contents of a relevant document and then to perform actions depending on  ...  We describe an initial study into the identification of important and useful information units within documents retrieved by an information retrieval system in response to a user query created in response  ...  Acknowledgment: This research is supported by Science Foundation Ireland (SFI) as a part of the ADAPT Centre at Dublin City University (Grant No: 12/CE/I2267).  ... 
doi:10.1145/3020165.3022154 dblp:conf/chiir/AroraJ17 fatcat:xch24wolfvdfzk7rndinijl63m

Studying Topical Relevance with Evidence-based Crowdsourcing

Oana Inel, Giannis Haralabopoulos, Dan Li, Christophe Van Gysel, Zoltán Szlávik, Elena Simperl, Evangelos Kanoulas, Lora Aroyo
2018 Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM '18  
Information Retrieval systems rely on large test collections to measure their effectiveness in retrieving relevant documents.  ...  The comparison is based on a series of crowdsourcing pilots experimenting with variables, such as relevance scale, document granularity, annotation template and the number of workers.  ...  The authors would also like to thank the anonymous crowd workers that participated in the crowdsourcing tasks.  ... 
doi:10.1145/3269206.3271779 dblp:conf/cikm/InelHLGSSKA18 fatcat:nwgkage36fcxjkblj4emlqli3a

Quality through flow and immersion

Carsten Eickhoff, Christopher G. Harris, Arjen P. de Vries, Padmini Srinivasan
2012 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '12  
Based on previous experience as well as psychological insights, we propose the use of a game in order to attract and retain a larger share of reliable workers to frequently requested crowdsourcing tasks such as relevance assessments and clustering.  ...  Acknowledgements: We would like to thank Jiyin He and the Fish4Knowledge project for providing us with the fish images and expert judgements.  ... 
doi:10.1145/2348283.2348400 dblp:conf/sigir/EickhoffHVS12 fatcat:t26baigetzboplrulu6cj76zci
Showing results 1 – 15 of 5,143.