Cross-replication Reliability – An Empirical Approach to Interpreting Inter-rater Reliability [article]

Ka Wong, Praveen Paritosh, Lora Aroyo
2021 arXiv   pre-print
We argue this framework can be used to measure the quality of crowdsourced datasets.  ...  It is based upon benchmarking IRR against baseline measures in a replication, one of which is a novel cross-replication reliability (xRR) measure based on Cohen's kappa.  ...  Acknowledgments: We would like to thank Gautam Prasad and Alan Cowen for their work on collecting and sharing the IRep dataset and open-sourcing it.  ... 
arXiv:2106.07393v1 fatcat:5prn3rqktzhudjd2y7to2glule

When zero may not be zero: A cautionary note on the use of inter‐rater reliability in evaluating grant peer review

Elena A. Erosheva, Patrícia Martinková, Carole J. Lee
2021 Journal of the Royal Statistical Society: Series A (Statistics in Society)  
Considerable attention has focused on studying reviewer agreement via inter-rater reliability (IRR) as a way to assess the quality of the peer review process.  ...  Inspired by a recent study that reported an IRR of zero in the mock peer review of top-quality grant proposals, we use real data from a complete range of submissions to the National Institutes of Health  ...  for assistance with implementing the restricted-range reliability functionalities into the R package and interactive Shiny application ShinyItemAnalysis.  ... 
doi:10.1111/rssa.12681 fatcat:ekwjcbzuuzgldir5jtm75stgda

RecSys'17 Joint Workshop on Interfaces and Human Decision Making for Recommender Systems

Peter Brusilovsky, Marco de Gemmis, Alexander Felfernig, Pasquale Lops, John O'Donovan, Nava Tintarev, Martijn Willemsen
2017 Proceedings of the Eleventh ACM Conference on Recommender Systems - RecSys '17  
We experiment on real-world and artificially generated data, finding that treating label ratings as ordinal, rather than interval data results in an increased inter-rater reliability.  ...  We hypothesize that the issues arising from rater bias may be mitigated by treating the data received as an ordered set of preferences rather than a collection of absolute values.  ...  This proposition relies on the intuition that corpora of relatively-valued labels will produce higher levels of inter-rater reliability (IRR) than those consisting of absolute values.  ... 
doi:10.1145/3109859.3109961 dblp:conf/recsys/BrusilovskyGFLO17 fatcat:vishcvo5jrdbnnj24ncydrbcfm

Further Data on the Reliability of the Mentalization Imbalances Scale and of the Modes of Mentalization Scale

Giulia Gagliardini, Laura Gatti, Antonello Colli
2020 Research in Psychotherapy Psychopathology Process and Outcome  
The aim of this study was to provide data on the Inter-Rater Reliability (IRR) and the test-retest reliability of the Mentalization Imbalances Scale (MIS) and the Modes of Mentalization Scale (MMS) in  ...  Our results provide support for the inter-rater reliability of the MIS and the MMS.  ...  Acknowledgments: The authors would like to thank the clinicians and raters who participated in this study by providing their evaluations.  ... 
doi:10.4081/ripppo.2020.450 pmid:32913829 pmcid:PMC7451392 fatcat:hewcgb4r3bgbxii66vclg34yny

Page 1369 of Psychological Abstracts Vol. 85, Issue 4 [page]

1998 Psychological Abstracts  
The Neurological Evaluation Scale (NES), the most widely used structured neurological examination in schizophrenia research, has had limited study of its inter-rater reliability (IRR). An augmented version  ...  (Wright State U, School of Medicine, Dept of Psychiatry, Dayton, OH) inter-rater reliability of the neurological examination in schizophrenia. Schizophrenia Research, 1998(Feb), Vol 29(3), 287-292.  ... 

Development of the Patient Education Materials Assessment Tool (PEMAT): A new measure of understandability and actionability for print and audiovisual patient information

Sarah J. Shoemaker, Michael S. Wolf, Cindy Brach
2014 Patient Education and Counseling  
Four rounds of reliability testing and refinement were conducted using raters untrained on the PEMAT. Agreement improved across rounds.  ...  We completed four rounds of reliability testing, and produced evidence of construct validity with consumers and readability assessments. Results-The experts deemed the PEMAT items face/content valid.  ...  Ross Davies for her guidance on instrument development, and Ken Carlson and Mark Spranca from Abt Associates for their valuable engagement with the reliability and validity testing of the PEMAT.  ... 
doi:10.1016/j.pec.2014.05.027 pmid:24973195 pmcid:PMC5085258 fatcat:2mcsz6spwvbotafweqpetphpjy

Assessment tool for hospital admissions related to medications: development and validation in older patients

Thomas G. H. Kempen, Mariann Hedström, Hanna Olsson, Amanda Johansson, Sara Ottosson, Yousif Al-Sammak, Ulrika Gillespie
2018 International Journal of Clinical Pharmacy  
The tool's inter-rater reliability (IRR) and criterion-related validity (CRV) were assessed: four pairs of either final-year undergraduate or postgraduate pharmacy students applied the tool to one of two  ...  Method We reviewed existing literature on methods to identify MRAs. The tool AT-HARM10 was developed using an iterative process including content validity and feasibility testing.  ...  Acknowledgements We are sincerely grateful to Dr. Christina Grzechnik Mörk for her contribution as one of the gold standard experts.  ... 
doi:10.1007/s11096-018-0768-8 fatcat:yp3oe2j7tbfohkhb2zdgtdgaye

Determining Intervention Fidelity From Chronological Field Notes

Jo Dowell, Linda Beeber, Todd Schwartz
2015 Journal of Nursing Measurement  
We computed inter-rater reliability (IRR) between the two raters and internal consistency on adherence for each of the theoretically-derived subscales: interpersonal psychotherapy (IPT) and cognitive behavior  ...  Inter-rater reliability (IRR) helps to establish the extent of consensus between the two raters using the CSPRS instrument in rating field notes.  ...  Do not put your name on the form. Your response is anonymous. We encourage you to be frank and honest in your evaluation. Please indicate your answers on the computerized answer sheet.  ... 
doi:10.1891/1061-3749.23.2.e67 pmid:26284832 fatcat:cbwrbduzcrci7fklrgfoypuslm

Level of personality functioning as a predictor of psychosocial functioning—Concurrent validity of criterion A

Tore Buer Christensen, Ingeborg Eikenaes, Benjamin Hummelen, Geir Pedersen, Tor-Erik Nysæter, Donna S. Bender, Andrew E. Skodol, Sara Germans Selvik
2019 Personality Disorders: Theory, Research, and Treatment  
The association between the Level of Personality Functioning Scale and psychosocial impairment based on other previously established psychosocial functioning instruments has not been reported.  ...  These four domains constitute the Level of Personality Functioning Scale, a trans-diagnostic measure of PD severity.  ...  The research on the LPFS is still in its adolescence, but it is to be expected that the increasing amount of research on the AMPD will pave the way for an empirically supported diagnostic model for PDs  ... 
doi:10.1037/per0000352 pmid:31580097 fatcat:e6lzxskhk5f25niidwzfwo26p4

Bridging the "last mile" gap between AI implementation and operation: "data awareness" that matters

Federico Cabitza, Andrea Campagner, Clara Balsano
2020 Annals of Translational Medicine  
in those medical disciplines that extensively rely on digital imaging.  ...  The latter hiatus, on the other hand, relates to the production and availability of a sufficient amount of reliable and accurate clinical data that is suitable to be the "experience" with which a machine  ...  Acknowledgments The authors are grateful to the Scientific Department of the IRCCS Galeazzi, and to its head, prof. Giuseppe Banfi, for their continuous and unconditional support. Funding: None.  ... 
doi:10.21037/atm.2020.03.63 pmid:32395545 pmcid:PMC7210125 fatcat:q5ynl23k5jfztegm77lakxwvwi

Interrater Reliability at the Top End: Measures of Pilots' Nontechnical Performance

Patrick Gontar, Hans-Juergen Hoermann
2015 The International Journal of Aviation Psychology  
For cognitive aspects of performance, inter-rater reliability was higher than for social aspects of performance. Agreement was lower on the pass/fail level than for the distinguished performance  ...  The aim of this study is to analyze influences on inter-rater reliability and within-group agreement within a highly experienced rater group when assessing pilots' non-technical skills. Background  ...  It can be concluded that the agreement of raters depended on the level of performance that was ICC(3) for inter-rater reliability was found to be poor for the dimensions communication (.12),  ... 
doi:10.1080/10508414.2015.1162636 fatcat:uxxo5sjykfg3lfyy6k3zdkhnza

Development and validation of the Cerebral Performance Categories-Extended (CPC-E)

Sondra A. Balouris, Ketki D. Raina, Jon C. Rittenberger, Clifton W. Callaway, Joan C. Rogers, Margo B. Holm
2015 Resuscitation  
We tested the CPC-E's intra-rater reliability (IR) percent agreement (n = 30; range = 73.3% -100%) and inter-rater reliability (IRR) (n = 50; range = 60% -100%) using retrospective chart reviews of the  ...  The specific aims were to establish the CPC-E's content validity, and to test its reliability, and feasibility in the hospital setting.  ...  Complex Activities of Daily Living (CADLs): Responsible for own medication (medication management), food preparation, shopping and transportation (drives or uses public transportation)  ... 
doi:10.1016/j.resuscitation.2015.05.013 pmid:26025569 fatcat:qrsmqnekjbebbfhldcsaxnj5w4

Metrology for AI: From Benchmarks to Instruments [article]

Chris Welty, Praveen Paritosh, Lora Aroyo
2019 arXiv   pre-print
We begin with the intuitive observation that evaluating the performance of an AI system is a form of measurement.  ...  One does not report mass, speed, or length, for example, of a studied object without disclosing the precision (measurement variance) and resolution (smallest detectable change) of the instrument used.  ...  Thus, a slew of inter-annotator agreement (also called inter-rater reliability, or IRR) metrics such as Fleiss' Kappa, or Cohen's pi, which was then generalized to all different scales by Krippendorff's  ... 
arXiv:1911.01875v1 fatcat:clcnimrspbhwvbsvfc4h5rv3hq

Reliability of infarct volumetry: Its relevance and the improvement by a software-assisted approach

Felix Friedländer, Ferdinand Bohmann, Max Brunkhorst, Ju-Hee Chae, Kavi Devraj, Yvette Köhler, Peter Kraft, Hannah Kuhn, Alexandra Lucaciu, Sebastian Luger, Waltraud Pfeilschifter, Rebecca Sadler (+5 others)
2016 Journal of Cerebral Blood Flow and Metabolism  
to an unrecognized low inter-rater and test-retest reliability with strong implications for statistical power and bias.  ...  In addition, we show the probable consequences of increased reliability for precision, p-values, effect inflation, and power calculation, exemplified by a systematic analysis of experimental stroke studies  ...  In order to analyze the effect of reliability on the precision of the observed effect, we took advantage of the assumptions that a. the t-test can be seen as a linear model stroke size = * treatment group  ... 
doi:10.1177/0271678x16681311 pmid:27909266 pmcid:PMC5536806 fatcat:zt2l2opqnncijgdsgoqbrkpoci

Assessing Momentary Well-Being in People Living With Dementia: A Systematic Review of Observational Instruments

Kristine Gustavsen Madsø, Elisabeth Flo-Groeneboom, Nancy A. Pachana, Inger Hilde Nordhus
2021 Frontiers in Psychology  
, measurement invariance, cross-cultural validity, measurement error and inter-rater/intra-rater/test–retest reliability and responsiveness.  ...  Twenty-two instruments assessing well-being were included for evaluation of measurement properties based on the systematic approach of the COnsensus-based Standards for the selection of health Measurement  ...  ACKNOWLEDGMENTS We wish to acknowledge librarian Kjersti Aksnes-Hopland at the University of Bergen Library for her important advice about search strategies, databases and tools for deduplication and management  ... 
doi:10.3389/fpsyg.2021.742510 pmid:34887803 pmcid:PMC8649635 fatcat:rfbqlnvsunbgdelq654dmg4vpq