
Crowdsourcing Truthfulness: The Impact of Judgment Scale and Assessor Bias [chapter]

David La Barbera, Kevin Roitero, Gianluca Demartini, Stefano Mizzaro, Damiano Spina
2020 Lecture Notes in Computer Science  
In this work we look at how experts and non-experts assess the truthfulness of content, focusing on the effect of the adopted judgment scale and of assessors' own bias on the judgments they perform.  ...  Quantitatively assessing the truthfulness of content becomes key, but it is often challenging and thus done by experts.  ...  This work is partially supported by an Australian Research Council Discovery Project (DP190102141) and a Facebook Research award.  ...
doi:10.1007/978-3-030-45442-5_26 fatcat:jjpzxi566vcznjhbny6s3vqv4q

The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale [article]

Michael Soprano, Kevin Roitero, David La Barbera, Davide Ceolin, Damiano Spina, Stefano Mizzaro, Gianluca Demartini
2021 arXiv   pre-print
A comprehensive analysis of crowdsourced judgments shows that: (1) the crowdsourced assessments are reliable when compared to an expert-provided gold standard; (2) the proposed dimensions of truthfulness  ...  However, fake news is a subtle matter: statements can be just biased ("cherry-picked"), imprecise, wrong, etc., and the unidimensional truth scale used in existing work cannot account for such differences  ...  We thank the reviewers for their comments; they provided insightful remarks that helped us to improve the overall quality of the paper.  ...
arXiv:2108.01222v1 fatcat:26prpndntffkfmjrfoabd3wz5m
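The reliability check mentioned in this abstract, comparing aggregated crowd scores against an expert-provided gold standard, can be sketched along the following lines. This is a hypothetical illustration rather than the authors' code: the per-statement mean aggregation, the data layout, and the use of scipy's Spearman correlation are all assumptions.

```python
# Hypothetical sketch: aggregate repeated crowd judgments per (statement,
# dimension) and correlate the aggregate with expert gold labels.
from statistics import mean
from scipy.stats import spearmanr  # assumed available

def aggregate(crowd_rows):
    """Mean score per (statement, dimension); crowd_rows holds
    (statement_id, dimension, score) tuples from all workers."""
    buckets = {}
    for statement_id, dimension, score in crowd_rows:
        buckets.setdefault((statement_id, dimension), []).append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

def agreement_with_experts(aggregated, expert, dimension="overall"):
    """Spearman correlation between aggregated crowd scores and expert labels
    for one truthfulness dimension (dimension names are illustrative)."""
    ids = [sid for (sid, dim) in aggregated if dim == dimension and sid in expert]
    crowd_scores = [aggregated[(sid, dimension)] for sid in ids]
    expert_scores = [expert[sid] for sid in ids]
    rho, _ = spearmanr(crowd_scores, expert_scores)
    return rho
```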

Crowdsourcing for search evaluation

Vitor R. Carvalho, Matthew Lease, Emine Yilmaz
2011 SIGIR Forum  
We believe this analysis can inform future experimental design and analysis when using crowdsourced human judgments.  ...  The development of algorithms for robust prediction of viewer affective response requires corpora accompanied by appropriate ground truth.  ...  Labs and in part by the University of Delaware Research Foundation.  ... 
doi:10.1145/1924475.1924481 fatcat:56ywunsa6vgdvmjmoulohmq5ye

Crowdsourcing for book search evaluation

Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval - SIGIR '11
We assess the output in terms of label agreement with a gold standard data set and observe the effect of the crowdsourced relevance judgments on the resulting system rankings.  ...  Increasingly, the use of crowdsourcing to collect relevance labels has been regarded as a viable alternative that scales with modest costs.  ...  judgments, at the scale of an IR test collection.  ... 
doi:10.1145/2009916.2009947 dblp:conf/sigir/KazaiKKM11 fatcat:5ekvpblhrfdehkmey76nkpjlbq
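The two analyses named in this abstract, label agreement with a gold set and the effect of crowd judgments on system rankings, are commonly computed along these lines. A minimal sketch under assumed data structures; the function names and the choice of Kendall's tau are illustrative, not taken from the paper.

```python
# Hypothetical sketch: (1) agreement of crowd labels with gold labels,
# (2) rank correlation between system orderings under gold vs. crowd judgments.
from scipy.stats import kendalltau  # assumed available

def label_agreement(crowd_labels, gold_labels):
    """Fraction of (topic, document) pairs where the crowd label equals the gold label."""
    shared = set(crowd_labels) & set(gold_labels)
    return sum(crowd_labels[k] == gold_labels[k] for k in shared) / len(shared)

def ranking_correlation(scores_under_gold, scores_under_crowd):
    """Kendall's tau between system rankings induced by the two judgment sets.
    Both arguments map system name -> effectiveness score (e.g. MAP)."""
    systems = sorted(scores_under_gold)
    tau, _ = kendalltau([scores_under_gold[s] for s in systems],
                        [scores_under_crowd[s] for s in systems])
    return tau
```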

Studying Topical Relevance with Evidence-based Crowdsourcing

Oana Inel, Giannis Haralabopoulos, Dan Li, Christophe Van Gysel, Zoltán Szlávik, Elena Simperl, Evangelos Kanoulas, Lora Aroyo
2018 Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM '18  
The comparison is based on a series of crowdsourcing pilots experimenting with variables such as relevance scale, document granularity, annotation template, and the number of workers.  ...  Finally, the crowdsourced annotation tasks provided a more accurate document relevance ranking than a single assessor relevance label.  ...  All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.  ...
doi:10.1145/3269206.3271779 dblp:conf/cikm/InelHLGSSKA18 fatcat:nwgkage36fcxjkblj4emlqli3a

Annotator Rationales for Labeling Tasks in Crowdsourcing

Mucahid Kutlu, Tyler McDonnell, Matthew Lease, Tamer Elsayed
2020 The Journal of Artificial Intelligence Research  
Firstly, rationales yield a multitude of benefits: more reliable judgments, greater transparency for evaluating both human raters and their judgments, reduced need for expert gold, the opportunity for  ...  dual-supervision from ratings and rationales, and added value from the rationales themselves.  ...  Any opinions, findings, and conclusions or recommendations expressed by the authors are entirely their own and do not represent those of the sponsoring agencies.  ... 
doi:10.1613/jair.1.12012 fatcat:ojnl6r2oorg6bboh74bybjaxte

Creation of Reliable Relevance Judgments in Information Retrieval Systems Evaluation Experimentation through Crowdsourcing: A Review

Parnia Samimi, Sri Devi Ravana
2014 The Scientific World Journal  
This paper is intended to explore different factors that influence the accuracy of relevance judgments accomplished by workers and how to improve the reliability of judgments in crowdsourcing  ...  In a classic setting, generating relevance judgments involves human assessors and is a costly and time-consuming task.  ...  Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper. Acknowledgment  ...
doi:10.1155/2014/135641 pmid:24977172 pmcid:PMC4055211 fatcat:qfbavfc45jfmzp4yisqyedx2o4

Crowdsourcing for affective-interaction in computer games

Gonçalo Tavares, André Mourão, João Magalhães
2013 Proceedings of the 2nd ACM international workshop on Crowdsourcing for multimedia - CrowdMM '13  
In this paper we describe a crowdsourcing effort for creating the ground-truth of a large-scale dataset of images capturing users playing a computer game.  ...  The dataset included over 40,000 images, the workers' judgments, the game's detected facial expression and what facial expression the player should be performing.  ...  [15] followed the following rules to improve judgments' quality: (1) assessors annotated a sub-set of the documents with a sub-set of the labels (this avoids the bias caused by having the same person  ... 
doi:10.1145/2506364.2506369 dblp:conf/mm/TavaresMM13 fatcat:gdb7oscc6jdsnlarl5nhlrr3nq

Repeatable and reliable search system evaluation using crowdsourcing

Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran Duc
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval - SIGIR '11
Using the first large-scale evaluation campaign that specifically targets the task of ad-hoc Web object retrieval over a number of deployed systems, we demonstrate that crowd-sourced evaluation campaigns  ...  The primary problem confronting any new kind of search task is how to bootstrap a reliable and repeatable evaluation campaign, and a crowd-sourcing approach provides many advantages.  ...  We will use as parameters the evaluation metric, the number of assessors per item, and the relevance scale used.  ...
doi:10.1145/2009916.2010039 dblp:conf/sigir/BlancoHHMPTT11 fatcat:3v4ewk3rmrek7acjm6tbgjtijq

Deep neural learning on weighted datasets utilizing label disagreement from crowdsourcing

Dongsheng Wang, Prayag Tiwari, Mohammad Shorfuzzaman, Ingo Schmitt
2021 Computer Networks  
Experts and crowds can work together to generate high-quality datasets, but such collaboration is limited to a large-scale pool of data.  ...  In other words, training on a large-scale dataset depends more on crowdsourced datasets with aggregated labels than on intensively expert-checked labels.  ...  ; and b is a bias.  ...
doi:10.1016/j.comnet.2021.108227 fatcat:sbc2b26fergenn7wvi6duwk3iy
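The idea named in the title, training on weighted datasets that exploit label disagreement, can be sketched generically: examples whose crowd labels show low agreement contribute less to the loss. The PyTorch-style snippet below is a minimal illustration under assumed inputs, not the network or weighting scheme used in the paper.

```python
# Hypothetical sketch: weight each training example by the share of crowd
# votes won by its majority label, then use a weighted cross-entropy loss.
import torch
import torch.nn.functional as F

def agreement_weights(label_counts):
    """label_counts: (n_examples, n_classes) tensor of per-class vote counts.
    Returns the majority-vote share per example, in [1/n_classes, 1]."""
    totals = label_counts.sum(dim=1).clamp(min=1)
    return label_counts.max(dim=1).values / totals

def weighted_loss(logits, majority_labels, weights):
    """Cross-entropy in which low-agreement examples contribute less."""
    per_example = F.cross_entropy(logits, majority_labels, reduction="none")
    return (weights * per_example).mean()
```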

Brief survey of crowdsourcing for data mining

Guo Xintong, Wang Hongzhi, Yangqiu Song, Gao Hong
2014 Expert systems with applications  
Crowdsourcing allows large-scale and flexible invocation of human input for data gathering and analysis, which introduces a new paradigm for the data mining process.  ...  We first review the challenges and opportunities of data mining tasks using crowdsourcing, and summarize their framework.  ...  Amazon Mechanical Turk is one of the most famous and largest-scale platforms.  ...
doi:10.1016/j.eswa.2014.06.044 fatcat:ralo3tmglnat5jncasbz7jnye4

HirePeer: Impartial Peer-Assessed Hiring at Scale in Expert Crowdsourcing Markets

Yasmine Kotturi, Anson Kahng, Ariel Procaccia, Chinmay Kulkarni
2020 Proceedings of the AAAI Conference on Artificial Intelligence
This paper reports on three studies that investigate both the costs and the benefits to workers and employers of impartial peer-assessed hiring.  ...  Expert crowdsourcing (e.g., Upwork.com) provides promising benefits such as productivity improvements for employers, and flexible working arrangements for workers.  ...  Acknowledgments This work was partially supported by the National Science Foundation under grants IIS-1350598, IIS-1714140, CCF-1525932, and CCF-1733556; by the Office of Naval Research under grants N00014  ... 
doi:10.1609/aaai.v34i03.5641 fatcat:afmnkkrx5vcqhkwepll4ru4l5m

An analysis of human factors and label accuracy in crowdsourcing relevance judgments

Gabriella Kazai, Jaap Kamps, Natasa Milic-Frayling
2012 Information Retrieval (Boston)
Crowdsourcing relevance judgments for the evaluation of search engines is used increasingly to overcome the issue of scalability that hinders traditional approaches relying on a fixed group of trusted  ...  This increases the need for a careful design of crowdsourcing tasks that attracts the right crowd for the given task and promotes quality work.  ...  Examples of bad designs include unclear and ambiguous task instructions, forms that restrict user input, or scales that bias the answers [45].  ...
doi:10.1007/s10791-012-9205-0 fatcat:xssec5ojevborgmcvfx72ri3wu

Can The Crowd Identify Misinformation Objectively? The Effects of Judgment Scale and Assessor's Background [article]

Kevin Roitero, Michael Soprano, Shaoyang Fan, Damiano Spina, Stefano Mizzaro, Gianluca Demartini
2020 arXiv   pre-print
This of course leads to the following research question: Can crowdsourcing be reliably used to assess the truthfulness of information and to create large-scale labeled collections for information credibility  ...  To address this issue, we present the results of an extensive study based on crowdsourcing: we collect thousands of truthfulness assessments over two datasets, and we compare expert judgments with crowd  ...  As compared to previous work that looked at crowdsourcing information credibility tasks, we look at the impact of assessors' background and rating scales on the quality of the truthfulness judgments they  ... 
arXiv:2005.06915v1 fatcat:u3utoxzp5ncx3f5dvrvk2zsfy4
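The effect of the judgment scale studied here is often analysed by projecting a fine-grained scale onto a coarser one and checking agreement with expert categories. The sketch below is illustrative only: the 0-100 scale, the cut-offs, and the median aggregation are assumptions, not the paper's setup.

```python
# Hypothetical sketch: coarsen a fine-grained truthfulness score and compare
# the median crowd judgment per statement with an expert category.
from statistics import median

def to_coarse(score, cutoffs=(34, 67)):
    """Map a 0-100 score to a 3-level scale: 0 = false, 1 = mixed, 2 = true."""
    return sum(score >= c for c in cutoffs)

def crowd_vs_expert(crowd_scores_by_item, expert_level_by_item):
    """Share of statements where the coarsened median crowd score matches the expert level."""
    matches = [to_coarse(median(scores)) == expert_level_by_item[item]
               for item, scores in crowd_scores_by_item.items()
               if item in expert_level_by_item]
    return sum(matches) / len(matches)
```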

Crowd Worker Strategies in Relevance Judgment Tasks

Lei Han, Eddy Maddalena, Alessandro Checco, Cristina Sarasua, Ujwal Gadiraju, Kevin Roitero, Gianluca Demartini
2020 Proceedings of the 13th International Conference on Web Search and Data Mining  
We observe how different levels of crowd work experience result in different working strategies, productivity levels, and quality and diversity of the crowdsourced judgments.  ...  Existing quality assurance techniques focus on answer aggregation or on the use of gold questions, where ground-truth data makes it possible to check the quality of the responses.  ...  This work is supported by ARC Discovery Project (DP190102141) and the Erasmus+ project DISKOW (60171990).  ...
doi:10.1145/3336191.3371857 dblp:conf/wsdm/0003MCSGRD20 fatcat:csumgnwrprhynn64dgz6rkagsq
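The two quality-assurance techniques this abstract contrasts with behavioural analysis, answer aggregation and gold questions, can be combined as in the sketch below. It is a generic illustration under assumed data structures and an arbitrary 0.7 accuracy threshold, not the procedure used in the paper.

```python
# Hypothetical sketch: filter workers by accuracy on gold questions, then
# aggregate the remaining answers per item by majority vote.
from collections import Counter, defaultdict

def gold_accuracy(answers, gold):
    """Per-worker accuracy on gold items.
    answers: worker -> {item: label}; gold: item -> true label."""
    accuracy = {}
    for worker, labels in answers.items():
        hits = [labels[i] == gold[i] for i in labels if i in gold]
        accuracy[worker] = sum(hits) / len(hits) if hits else 0.0
    return accuracy

def majority_vote(answers, gold, min_gold_accuracy=0.7):
    """Majority label per non-gold item, using only workers who pass the gold check."""
    accuracy = gold_accuracy(answers, gold)
    votes = defaultdict(Counter)
    for worker, labels in answers.items():
        if accuracy[worker] >= min_gold_accuracy:
            for item, label in labels.items():
                if item not in gold:
                    votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}
```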
Showing results 1 — 15 out of 100 results