56,720 Hits in 4.0 sec

Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants [article]

Max Bartolo, Tristan Thrush, Sebastian Riedel, Pontus Stenetorp, Robin Jia, Douwe Kiela
2022 arXiv   pre-print
We collect training datasets in twenty experimental settings and perform a detailed analysis of this approach for the task of extractive question answering (QA) for both standard and adversarial data collection  ...  In addition, we find that using GAA-assisted training data leads to higher downstream model performance on a variety of question answering tasks over adversarial data collection.  ...  Acknowledgements The authors would like to thank the Dynabench team for their feedback and continuous support.  ... 
arXiv:2112.09062v3 fatcat:uibenpoe6fba3crbxbo3foovem
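The generative-annotation-assistant workflow this entry describes can be pictured as a loop in which a generator proposes a candidate question over a passage and a human annotator accepts, edits, or rejects it before the example is added to the training set. The sketch below is only an illustration of that loop; `propose_question` and `collect_human_feedback` are hypothetical stubs standing in for the paper's generative model and the Dynabench annotation interface.

```python
# Minimal sketch of a generator-in-the-loop annotation round.
# `propose_question` and `collect_human_feedback` are hypothetical stubs:
# in the paper they correspond to a generative model and the crowdworker UI.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    passage: str
    question: str
    answer: str


def propose_question(passage: str) -> str:
    """Stand-in for a generative annotation assistant (e.g. a seq2seq QG model)."""
    return f"What does the following passage describe? {passage[:40]}..."


def collect_human_feedback(passage: str, suggestion: str) -> Optional[Example]:
    """Stand-in for the annotator accepting, editing, or rejecting the suggestion."""
    # Here we simply accept the suggestion and use the first sentence as a placeholder answer.
    return Example(passage=passage, question=suggestion, answer=passage.split(".")[0])


def annotation_round(passages: List[str]) -> List[Example]:
    dataset = []
    for passage in passages:
        suggestion = propose_question(passage)
        example = collect_human_feedback(passage, suggestion)
        if example is not None:  # a rejected suggestion produces no example
            dataset.append(example)
    return dataset


if __name__ == "__main__":
    demo = ["The Dynabench platform supports dynamic adversarial data collection."]
    print(annotation_round(demo))
```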

Using FHIR to Construct a Corpus of Clinical Questions Annotated with Logical Forms and Answers

Sarvesh Soni, Meghana Gudala, Daisy Zhe Wang, Kirk Roberts
2020 AMIA Annual Symposium Proceedings  
This paper describes a novel technique for annotating logical forms and answers for clinical questions by utilizing Fast Healthcare Interoperability Resources (FHIR).  ...  Using the proposed approach, two annotators curated an annotated dataset of 1000 questions in less than 1 week.  ...  We analyzed the combinations of FHIR resource and answer types in our dataset to get an impression of the most frequent question varieties.  ... 
pmid:32308918 pmcid:PMC7153115 fatcat:bmpwmzhdzja2nbds5wdzfuq2ey
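FHIR exposes clinical data through a REST API that returns JSON resources, which is what makes grounding a clinical question in a concrete query feasible. The sketch below retrieves Patient resources from the public HAPI FHIR R4 test server; the server URL and search parameters are illustrative assumptions, not part of the paper's annotation pipeline.

```python
# Sketch: grounding a clinical question in a FHIR REST query.
# Assumes the public HAPI FHIR R4 test server is reachable; swap in your own base URL.
import requests

FHIR_BASE = "https://hapi.fhir.org/baseR4"  # illustrative public test server


def search_patients(family_name: str) -> list:
    """Search Patient resources by family name and return the matching entries."""
    resp = requests.get(
        f"{FHIR_BASE}/Patient",
        params={"family": family_name, "_count": 5},
        headers={"Accept": "application/fhir+json"},
        timeout=30,
    )
    resp.raise_for_status()
    bundle = resp.json()  # a FHIR Bundle resource
    return [entry["resource"] for entry in bundle.get("entry", [])]


if __name__ == "__main__":
    for patient in search_patients("Smith"):
        print(patient.get("id"), patient.get("name"))
```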

Collecting high-quality adversarial data for machine reading comprehension tasks with humans and models in the loop [article]

Damian Y. Romero Diaz, Magdalena Anioł, John Culnan
2022 arXiv   pre-print
We present our experience as annotators in the creation of high-quality, adversarial machine-reading-comprehension data for extractive QA for Task 1 of the First Workshop on Dynamic Adversarial Data Collection  ...  We set up a quasi-experimental annotation design and perform quantitative analyses across groups with different numbers of annotators focusing on successful adversarial attacks, cost analysis, and annotator  ...  Anders Søgaard for his valuable insights during the revision of this article.  ... 
arXiv:2206.14272v1 fatcat:46bzsps7djcm3pzbbbgidjldyu

Comparing Cancer Information Needs for Consumers in the US and China

Zongcheng Ji, Yaoyun Zhang, Jun Xu, Xiaoling Chen, Yonghui Wu, Hua Xu
2017 Studies in Health Technology and Informatics  
This study compares the cancer information needs for consumers in the US and China. Specifically, we first collected 1,000 cancer-related questions from Yahoo! Answers and Baidu Zhidao, respectively.  ...  Then, we developed a taxonomy of health information needs and manually classified the questions using the taxonomy.  ...  Therefore, it is necessary to build automated informatics tools such as search engines or question answering systems that can integrate the rich resources and provide more efficient and effective cancer  ... 
pmid:29295066 pmcid:PMC5805146 fatcat:ryzxghies5a7blvil2paqcbr4i

A Knowledge Graph Question-Answering Platform Trained Independently of the Graph

Reham Omar, Ishika Dhall, Nadia Sheikh, Essam Mansour
2021 International Semantic Web Conference  
Without preprocessing or annotated questions on KGs, KGQAn outperformed existing systems in KG question answering by at least 33% in F1-measure and 61% in precision.  ...  During the demo, the audience will experience KGQAn for question answering on real KGs of topics of interest to them, such as DBpedia and OpenCitations Graph, and review the generated SPARQL queries and  ...  KGs are frequently updated, meaning that these systems will need to obtain more annotated questions or redo the preprocessing. Hence, there is a need for novel techniques that are trained independently of KGs.  ... 
dblp:conf/semweb/OmarDSM21 fatcat:k3xyhdv3fvcj5ke4kvuphjxzvq
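The demo lets users inspect the SPARQL that KGQAn generates for a natural-language question. For readers unfamiliar with that step, the snippet below runs a hand-written SPARQL query against the public DBpedia endpoint using SPARQLWrapper; the query is a generic example, not one produced by KGQAn.

```python
# Sketch: executing a SPARQL query against DBpedia (the kind of query a KGQA
# system such as KGQAn would generate from a natural-language question).
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
    SELECT ?capital WHERE {
        dbr:Canada dbo:capital ?capital .
    }
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    # e.g. http://dbpedia.org/resource/Ottawa
    print(binding["capital"]["value"])
```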

Bridging the Gap Between Consumers' Medication Questions and Trusted Answers

Asma Ben Abacha, Yassine Mrabet, Mark Sharp, Travis R Goodwin, Sonya E Shooshan, Dina Demner-Fushman
2019 Studies in Health Technology and Informatics  
We first present the manual annotation and answering process.  ...  The gold standard (https://github.com/abachaa/Medication_QA_MedInfo2019) consists of six hundred and seventy-four question-answer pairs with annotations of the question focus and type and the answer source  ...  Figure 1 presents a word cloud of the most frequent terms in the selected consumer health questions. Annotating the Questions.  ... 
doi:10.3233/shti190176 pmid:31437878 fatcat:qdk5ua76mve5rn7gtiigohm45e
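The gold standard is distributed through the linked GitHub repository as a spreadsheet. A hedged loading sketch follows; the file name and column names below are assumptions about the release layout and may need adjusting against the actual files.

```python
# Sketch: loading the MedicationQA gold standard with pandas.
# The file name and column names are assumptions about the repository layout
# (https://github.com/abachaa/Medication_QA_MedInfo2019); adjust as needed.
import pandas as pd

df = pd.read_excel("MedInfo2019-QA-Medications.xlsx")  # assumed file name

# Assumed columns: Question, Focus (drug), Type, Answer
for _, row in df.head(3).iterrows():
    print(row.get("Question"), "->", row.get("Answer"))
```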

Implications for generating clarification requests in task-oriented dialogues

Verena Rieser, Johanna D. Moore
2005 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL '05  
Acknowledgements The authors would like to thank Kepa Rodriguez, Oliver Lemon, and David Reitter for help and discussion.  ...  Alternative questions prompt the addressee to disambiguate the hypothesis. Answer: By definition, certain types of question prompt for certain answers.  ...  See (Rieser, 2004) for a detailed discussion. The annotation was only performed once.  ... 
doi:10.3115/1219840.1219870 dblp:conf/acl/RieserM05 fatcat:chvia5hjx5fchkvdrotbos2hfi

A Task-Based Evaluation of French Morphological Resources and Tools

Delphine Bernhard, Bruno Cartoni, Delphine Tribout
2011 Linguistic Issues in Language Technology  
We first describe an annotation experiment whose goal is to evaluate the role of morphology for one task, namely Question Answering (QA).  ...  Morphology is a key component for many Language Technology applications.  ...  Overall, the dataset we annotated comprises 664 question-answer pairs, for 201 different questions.  ... 
doi:10.33011/lilt.v5i.1229 fatcat:w7lkdglncbechbv5tkcda333ri

COALA - A Rule-Based Approach to Answer Type Prediction

Nadine Steinmetz, Kai-Uwe Sattler
2020 International Semantic Web Conference  
To answer a question correctly, the answer type must be detected beforehand.  ...  Especially in the field of Question Answering (QA) over knowledge bases, answers might be of many different types, as natural language is ambiguous and a question might lead to different relevant queries  ...  Predicting the answer type with a classifier requires a large annotated dataset and, consequently, human annotators.  ... 
dblp:conf/semweb/SteinmetzS20 fatcat:rbmg2kho6ncjzaxocfjbchg43e
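Rule-based answer type prediction of the kind COALA performs can be illustrated with a few wh-word heuristics. The sketch below is a deliberately simplified stand-in, not the paper's rule set.

```python
# Sketch: a toy rule-based answer type predictor (far simpler than COALA's rules).
def predict_answer_type(question: str) -> str:
    q = question.lower().strip()
    if q.startswith(("who", "whose", "whom")):
        return "Person"
    if q.startswith("where"):
        return "Place"
    if q.startswith("when"):
        return "Date"
    if q.startswith(("how many", "how much")):
        return "Number"
    if q.startswith(("is ", "are ", "does ", "do ", "did ", "was ", "were ")):
        return "Boolean"
    return "Thing"  # fallback for what/which and everything else


assert predict_answer_type("Who wrote Hamlet?") == "Person"
assert predict_answer_type("How many moons does Mars have?") == "Number"
print(predict_answer_type("Where is the Eiffel Tower?"))  # Place
```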

Active Learning with Partial Feedback [article]

Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan
2019 arXiv   pre-print
To annotate example corpora for multiclass classification, we might need to ask multiple yes/no questions, exploiting a label hierarchy if one is available.  ...  Each answer eliminates some classes, leaving the learner with a partial label.  ...  Since 30k questions per re-training (for TinyImagenet) seems infrequent, we compared against 10x more frequent re-training. More frequent re-training conferred no benefit (Appendix B).  ... 
arXiv:1802.07427v4 fatcat:tge6ssrhxrg25awmcgk5l5wsxe
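The partial-feedback idea, where each yes/no question about a group in the label hierarchy eliminates some classes, can be seen in a few lines. The hierarchy and questioning policy below are toy assumptions, not the paper's learned acquisition strategy.

```python
# Sketch: narrowing a label set with yes/no questions over a toy label hierarchy.
# Each answer eliminates classes, leaving the learner with a partial label.
HIERARCHY = {
    "animal": {"dog", "cat", "horse"},
    "vehicle": {"car", "truck", "bicycle"},
}
ALL_CLASSES = set().union(*HIERARCHY.values())


def ask(node: str, true_label: str) -> bool:
    """Simulated annotator answering: does the example belong to this group?"""
    return true_label in HIERARCHY[node]


def partial_label(true_label: str) -> set:
    candidates = set(ALL_CLASSES)
    for node, members in HIERARCHY.items():
        if ask(node, true_label):   # "yes" keeps only this subtree
            candidates &= members
        else:                       # "no" eliminates the whole subtree
            candidates -= members
    return candidates               # the partial label the learner receives


# Coarse questions alone leave {'dog', 'cat', 'horse'}: a partial label that
# finer-grained questions would narrow further.
print(partial_label("cat"))
```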

Beyond information retrieval--medical question answering

Minsuk Lee, James Cimino, Hai R Zhu, Carl Sable, Vijay Shanker, John Ely, Hong Yu
2006 AMIA Annual Symposium Proceedings  
Physicians have many questions when caring for patients, and frequently need to seek answers for their questions.  ...  automatically generate paragraph-level text for definitional questions (i.e., "What is X?").  ...  capacity of LT CHUNK to efficiently capture noun phrases of medical questions.  ... 
pmid:17238385 pmcid:PMC1839371 fatcat:4wfvs42idnf4xpgedvlarovgpq
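Extracting noun phrases from a clinical question, which the entry above attributes to LT CHUNK, can be approximated with NLTK's regular-expression chunker. This is an explicit stand-in for LT CHUNK, not its actual interface.

```python
# Sketch: noun-phrase chunking of a medical question with NLTK
# (a stand-in for LT CHUNK; run nltk.download('punkt') and
# nltk.download('averaged_perceptron_tagger') once beforehand).
import nltk

GRAMMAR = "NP: {<DT>?<JJ>*<NN.*>+}"  # determiner? adjectives* nouns+
chunker = nltk.RegexpParser(GRAMMAR)


def noun_phrases(question: str) -> list:
    tokens = nltk.word_tokenize(question)
    tagged = nltk.pos_tag(tokens)
    tree = chunker.parse(tagged)
    return [
        " ".join(word for word, _ in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
    ]


print(noun_phrases("What is the recommended dose of amoxicillin for acute otitis media?"))
```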

Gender and Racial Bias in Visual Question Answering Datasets [article]

Yusuke Hirota, Yuta Nakashima, Noa Garcia
2022 arXiv   pre-print
A popular task in the field is visual question answering (VQA), which aims to answer questions about images.  ...  For this reason, we investigate gender and racial bias in five VQA datasets.  ...  Top-20 frequent answers for the question type "what is this" in VQA 2.0. Above: frequent answers for questions about women (orange). Below: frequent answers for questions about men (green).  ... 
arXiv:2205.08148v2 fatcat:d366jca56fdyhjsa4dvlb4hsge
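The analysis described above, comparing the most frequent answers for questions about women versus men, amounts to counting answers per group. A minimal sketch on toy records follows; the field names are assumptions for illustration, not the VQA 2.0 schema.

```python
# Sketch: top-k most frequent answers per demographic group of questions.
# The record layout is an assumption for illustration, not the VQA 2.0 schema.
from collections import Counter

records = [
    {"group": "woman", "answer": "umbrella"},
    {"group": "woman", "answer": "phone"},
    {"group": "woman", "answer": "umbrella"},
    {"group": "man",   "answer": "skateboard"},
    {"group": "man",   "answer": "phone"},
]


def top_answers(records: list, group: str, k: int = 20) -> list:
    counts = Counter(r["answer"] for r in records if r["group"] == group)
    return counts.most_common(k)


print(top_answers(records, "woman"))  # [('umbrella', 2), ('phone', 1)]
print(top_answers(records, "man"))    # [('skateboard', 1), ('phone', 1)]
```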

Sequential dialogue act recognition for Arabic argumentative debates

Samira Ben Dbabis, Hatem Ghorbel, Lamia Hadrich Belguith
2018 Revista de Procesamiento de Lenguaje Natural (SEPLN)  
Learning results are notably important for the segmentation task (F-score = 97.9%) and relatively reliable within the annotation process (F-score = 63.4%), given the complexity of identifying argumentative  ...  CARD corpus labeled using the SADA annotation schema.  ...  As a lexical characteristic, we focus on punctuation as a determinant clue that occurs frequently at the end of an utterance. For example, question marks mostly delimit the end of a question.  ... 
dblp:journals/pdln/DbabisGB18 fatcat:wadsxt7tyrhy7dp64rml2v7ntu
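The punctuation cue mentioned above can be turned into a simple segmentation feature. The sketch below splits a turn at terminal punctuation and flags question-mark utterances; it is only a toy illustration of the lexical feature, not the SADA/CARD pipeline.

```python
# Sketch: using terminal punctuation as a segmentation cue and a question flag
# (a toy illustration of the lexical feature, not the SADA/CARD pipeline).
import re


def segment_turn(turn: str) -> list:
    # Split after ., !, or ? while keeping the punctuation with its utterance.
    utterances = re.findall(r"[^.!?]+[.!?]?", turn)
    return [
        {"text": u.strip(), "is_question": u.strip().endswith("?")}
        for u in utterances if u.strip()
    ]


for seg in segment_turn("I disagree with that point. Why would taxes rise? Explain."):
    print(seg)
```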

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu (+1 others)
2018 Proceedings of the Workshop on Machine Reading for Question Answering  
provides rich annotations for more question types, especially yes-no and opinion questions, which leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K answers  ...  DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao 1 ; answers are manually generated. (2) question types: it  ...  Kenneth Ward Church for his valuable suggestions and revisions on this paper, Prof. Sujian Li for her supports on this paper, and the anonymous reviewers for their helpful comments on this work.  ... 
doi:10.18653/v1/w18-2605 dblp:conf/acl/HeLLLZXLWWSLWW18 fatcat:oh27suoxmvaipjiavjpupblm74

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications [article]

Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu (+1 others)
2018 arXiv   pre-print
rich annotations for more question types, especially yes-no and opinion questions, which leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K answers and 1M documents  ...  DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao; answers are manually generated. (2) question types: it provides  ...  Kenneth Ward Church for his valuable suggestions and revisions on this paper, Prof. Sujian Li for her supports on this paper, and the anonymous reviewers for their helpful comments on this work.  ... 
arXiv:1711.05073v4 fatcat:b5xrn4a5xbg2jghdszl57ztrim
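DuReader is distributed as JSON-lines files in which each line is one question record. The field names used below (question, question_type, answers) are assumptions about the released preprocessed format and should be verified against the actual data files.

```python
# Sketch: streaming DuReader-style JSON-lines records.
# Field names are assumptions about the released preprocessed format; verify
# them against the actual data files before relying on this.
import json


def iter_records(path: str):
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


if __name__ == "__main__":
    for rec in iter_records("search.train.json"):  # assumed file name
        print(rec.get("question_type"), rec.get("question"))
        print("answers:", rec.get("answers", [])[:1])
        break
```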
Showing results 1 — 15 out of 56,720 results