The effect of assessor coverage and assessor accuracy on rank aggregation precision

Laurence A. F. Park, Glenn Stone
2015 Proceedings of the 20th Australasian Document Computing Symposium - ADCS '15
The accuracy of the aggregated ranking depends on the accuracy of the assessor ranking and the assessor coverage of the items.  ...  the assessment on the precision of the aggregated ranking.  ...  In this article, we examine the effect of assessor coverage and assessor accuracy on the accuracy of rank aggregation.  ... 
doi:10.1145/2838931.2838937 dblp:conf/adcs/ParkS15 fatcat:ubr6vqejcfcujplz6feizqnvni

A Study of Realtime Summarization Metrics

Matthew Ekstrand-Abueg, Richard McCreadie, Virgil Pavlu, Fernando Diaz
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
In this paper, we present a study of TREC-TS track evaluation methodology, with the aim of documenting its design, analyzing its effectiveness, as well as identifying improvements and best practices for  ...  The TREC 2013-2015 Temporal Summarization (TREC-TS) track was one of the first evaluation campaigns to tackle the challenges of real-time summarization evaluation, introducing new metrics, ground-truth  ...  Furthermore, we asked assessors to select one or more explanations for the decision. These explanations included: topicality, coverage, quality, redundancy, and timeliness.  ... 
doi:10.1145/2983323.2983653 dblp:conf/cikm/Ekstrand-AbuegM16 fatcat:qb2tuidwk5g4tdkpbhx3xej6vm

Exploiting User Signals and Stochastic Models to Improve Information Retrieval Systems and Evaluation

Maria Maistro
2019 SIGIR Forum  
By modeling relevance judgements and crowd assessors as sources of uncertainty, we directly combine the performance measures computed on the ground-truth generated by each crowd assessor, instead of adopting  ...  We start by providing a formal definition of utility-oriented measurement of retrieval effectiveness, based on the representational theory of measurement.  ...  , which exploits a probabilistic model to infer the accuracy of each assessors and trusts more the assessors with higher accuracy.  ... 
doi:10.1145/3308774.3308805 fatcat:rt7yzttx7vchpijbjaxjkogk3i

Using preference judgments for novel document retrieval

Praveen Chandar, Ben Carterette
2012 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '12  
based on the number of novel (and redundant) subtopics it is relevant to.  ...  Most work on this problem is based on subtopics: diversity rankers score documents against a set of hypothesized subtopics, and diversity rankings are evaluated by assigning a value to each ranked document  ...  Any opinions, findings and conclusions or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsor.  ... 
doi:10.1145/2348283.2348398 dblp:conf/sigir/ChandarC12 fatcat:s4ko5glszncz7azxsmxwvcs7aa

Visual diversification of image search results

Reinier H. van Leuken, Lluis Garcia, Ximena Olivares, Roelof van Zwol
2009 Proceedings of the 18th international conference on World wide web - WWW '09  
assessors.  ...  Due to the reliance on the textual information associated with an image, image search engines on the Web lack the discriminative power to deliver visually diverse search results.  ...  ACKNOWLEDGMENTS The authors express their gratitude toward all the assessors that helped in the establishment of the ground truth.  ... 
doi:10.1145/1526709.1526756 dblp:conf/www/LeukenPOZ09 fatcat:h2b62otiunenvcpkfiypv2dmne

Automatic generation of overview timelines

Russell Swan, James Allan
2000 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00  
We present a statistical model of feature occurrence over time, and develop tests based on classical hypothesis testing for significance of term appearance on a given date.  ...  To test the validity of our technique we extracted a large number of these topics from a test corpus and had human evaluators judge how well the selected features captured the gist of the topics, and how  ...  (The ordering of the clusters were randomized before being given to the assessors, and the assessors began at different points, so we can rule out order effects.)  ... 
doi:10.1145/345508.345546 dblp:conf/sigir/SwanA00 fatcat:ldgpgrlqczarzcxj76ddttsxii

Enriching Documents with Examples

Jinhan Kim, Sanghoon Lee, Seung-Won Hwang, Sunghun Kim
2013 ACM Transactions on Information Systems  
To address this problem, we propose a novel code example recommendation system that combines the strength of browsing documents and searching for code examples and returns API documents embedded with high-quality  ...  Our evaluation results show that our approach provides code examples with high precision and boosts programmer productivity.  ...  Compared to the gold standard ranking, eXoaDocs achieved high precision and recall from five assessors: 45% and 71%, respectively. Summarization precision/recall.  ... 
doi:10.1145/2414782.2414783 fatcat:auew3xx6jbbjzj5chz2gds2tsy

High-resolution land value maps reveal underestimation of conservation costs in the United States

Christoph Nolte
2020 Proceedings of the National Academy of Sciences of the United States of America  
The justification and targeting of conservation policy rests on reliable measures of public and private benefits from competing land uses.  ...  The resulting estimates predict conservation cost with up to 8.5 times greater accuracy than earlier proxies.  ...  In the United States, the aggregation of nationwide public records on land sales and valuation is largely the domain of commercial endeavors.  ... 
doi:10.1073/pnas.2012865117 pmid:33168741 fatcat:kqahenm33rc77lvebr4j23fa4q

Protein function prediction by massive integration of evolutionary analyses and multiple data sources

Domenico Cozzetto, Daniel WA Buchan, Kevin Bryson, David T Jones
2013 BMC Bioinformatics  
This work was partially supported by the UK Biotechnology and  ...  Acknowledgements We thank Anna Lobley for providing a set of predicted GO terms and p-values for human protein sequences generated using her FunctionSpace approach.  ...  The graph shows the performance of our aggregate method in comparison with the simple annotation strategies calculated by the CAFA assessors Priors and BLAST.  ... 
doi:10.1186/1471-2105-14-s3-s1 pmid:23514099 pmcid:PMC3584902 fatcat:wbtoxkvicrgttpqettoa2nnd2a

Guidelines and ethical considerations for Assessment Center operations

1989 Journal of business and psychology  
conduct assessment centers; (2) information to managers deciding whether or not to institute assessment center methods; (3) instruction to assessors serving on the staff of an assessment center; and (4  ...  ) guidance on the use of technology and navigating multicultural contexts; (5) information for relevant legal bodies on what are considered standard professional practices in this area.  ...  Second, given recent research on the effectiveness of various assessor training components, the Congress suggested an expansion of the guidelines in this area as well.  ... 
doi:10.1007/bf01016446 fatcat:2bpbaitcefhkdldif4phxixbe4

Guidelines and Ethical Considerations for Assessment Center Operations

Deborah E. Rupp, Brian J. Hoffman, David Bischof, William Byham, Lynn Collins, Alyssa Gibbons, Shinichi Hirose, Martin Kleinmann, Jeffrey D. Kudisch, Martin Lanik, Duncan J. R. Jackson, Myungjoon Kim (+11 others)
2015 Journal of Management  
conduct assessment centers; (2) information to managers deciding whether or not to institute assessment center methods; (3) instruction to assessors serving on the staff of an assessment center; and (4  ...  ) guidance on the use of technology and navigating multicultural contexts; (5) information for relevant legal bodies on what are considered standard professional practices in this area.  ...  Second, given recent research on the effectiveness of various assessor training components, the Congress suggested an expansion of the guidelines in this area as well.  ... 
doi:10.1177/0149206314567780 fatcat:frnd7vcxbfdz7j3gcgzaboi4vi

Stakeholder perspectives on workplace-based performance assessment: towards a better understanding of assessor behaviour

Laury P. J. W. M. de Jonge, Angelique A. Timmerman, Marjan J. B. Govaerts, Jean W. M. Muris, Arno M. M. Muijtjens, Anneke W. M. Kramer, Cees P. M. van der Vleuten
2017 Advances in Health Sciences Education  
Validity in WBA mainly depends on how stakeholders (e.g. clinical supervisors and learners) use the assessments, rather than on the intrinsic qualities of instruments and methods.  ...  Differing perspectives may variously affect stakeholders' acceptance and use, and consequently the effectiveness, of assessment programmes.  ...  The effectiveness of assessments may depend on stakeholders' beliefs and their associated perspectives on the assessment process.  ... 
doi:10.1007/s10459-017-9760-7 pmid:28155004 pmcid:PMC5663793 fatcat:m2fxp7ywwff7nibvrbjueu4awu

BIG: An agent for resource-bounded information gathering and decision making

Victor Lesser, Bryan Horling, Frank Klassner, Anita Raja, Thomas Wagner, Shelley XQ. Zhang
2000 Artificial Intelligence  
The large number of information sources and their different levels of accessibility, reliability and associated costs present a complex information gathering control problem.  ...  The World Wide Web has become an invaluable information resource but the explosion of available information has made Web search a time consuming and complex process.  ...  Anita Raja focused primarily on the extraction issues and implemented the grep-ks, cgrep-ks, and table-ks knowledge sources.  ... 
doi:10.1016/s0004-3702(00)00005-9 fatcat:wwxxj2zlzfhchddvetkwry4c2y

Test Collection Based Evaluation of Information Retrieval Systems

Mark Sanderson
2010 Foundations and Trends in Information Retrieval  
Use of test collections and evaluation measures to assess the effectiveness of information retrieval systems has its origins in work dating back to the early 1950s.  ...  This monograph surveys the research conducted and explains the methods and measures devised for evaluation of retrieval systems, including a detailed look at the use of statistical significance testing  ...  found that 250 topics with 20 judgments per topic were the most cost-effective in terms of minimizing assessor effort and maximizing accuracy in ranking runs.  ... 
doi:10.1561/1500000009 fatcat:qdacqkqj25eojkpchctdzvrt2e

Indeterminacy in the use of preset criteria for assessment and grading

D. Royce Sadler
2009 Assessment & Evaluation in Higher Education  
When assessment tasks are set for students in universities and colleges, a common practice is to advise them of the criteria that will be used for grading their responses.  ...  Six anomalies in the ways assessors approach the grading task are identified, together with several likely contributing factors.  ...  This effect can also account for some of the apparent agreement on aggregate scores among different assessors, despite disagreement about levels of performance on the separate criteria.  ... 
doi:10.1080/02602930801956059 fatcat:jakm6lxynvfsle5jnfcmft4xia