
UnifiedQA: Crossing Format Boundaries With a Single QA System [article]

Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi
2020 arXiv   pre-print
As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.  ...  Finally, simply fine-tuning this pre-trained QA model into specialized models results in a new state of the art on 6 datasets, establishing UnifiedQA as a strong starting point for building QA systems.  ...  Acknowledgments The authors would like to thank Collin Raffel, Adam Roberts, and Nicholas Lourie for their help with the T5 framework and for providing feedback on an earlier version of this work.  ... 
arXiv:2005.00700v3 fatcat:x7ri5waajfeuxlniz37ptsn4pq

UNIFIEDQA: Crossing Format Boundaries with a Single QA System

Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi
2020 Findings of the Association for Computational Linguistics: EMNLP 2020   unpublished
As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UNIFIEDQA, that performs well across 20 QA datasets spanning 4 diverse formats.  ...  building QA systems.  ...  Acknowledgments The authors would like to thank Collin Raffel, Adam Roberts, and Nicholas Lourie for their help with the T5 framework and for providing feedback on an earlier version of this work.  ... 
doi:10.18653/v1/2020.findings-emnlp.171 fatcat:x3xinbne4vct5kffc3dbkdpixa

MetaQA: Combining Expert Agents for Multi-Skill Question Answering [article]

Haritz Puerto, Gözde Gül Şahin, Iryna Gurevych
2021 arXiv   pre-print
We release our code and a dataset of answer predictions from expert agents for 16 QA datasets to foster future developments of multi-agent systems on https://github.com/UKPLab/MetaQA.  ...  out-of-domain scenarios, ii) is highly data-efficient to train, and iii) can be adapted to any QA format.  ...  Acknowledgements This work has been supported by the German Research Foundation (DFG) as part of the project UKP-SQuARE with the number GU 798/29-1.  ... 
arXiv:2112.01922v2 fatcat:dlxkcmww35ghvoifyvtjhrnjru
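The snippet only names the idea of combining expert agents. As a rough, generic sketch of that multi-agent setup (not MetaQA's actual answer-selection model; see the paper and the linked repository for that), the Python below sends a question to several single-skill QA agents and keeps the most confident prediction; the QAAgent interface and the confidence scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical interface: each expert agent maps (question, context) to
# (answer, confidence). MetaQA itself trains a selector over agent outputs
# rather than trusting raw confidences.
QAAgent = Callable[[str, str], Tuple[str, float]]

@dataclass
class Prediction:
    agent_name: str
    answer: str
    confidence: float

def answer_with_experts(question: str, context: str,
                        agents: Dict[str, QAAgent]) -> Prediction:
    """Query every expert agent and return the highest-confidence answer."""
    predictions = [
        Prediction(name, *agent(question, context))
        for name, agent in agents.items()
    ]
    return max(predictions, key=lambda p: p.confidence)
```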

ProQA: Structural Prompt-based Pre-training for Unified Question Answering [article]

Wanjun Zhong, Yifan Gao, Ning Ding, Yujia Qin, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
2022 arXiv   pre-print
To address this issue, we present ProQA, a unified QA paradigm that solves various tasks through a single model.  ...  Furthermore, ProQA is pre-trained with structural prompt-formatted large-scale synthesized corpus, which empowers the model with the commonly-required QA ability.  ...  ., QA datasets). UnifiedQA (Khashabi et al., 2020b) crosses the format boundaries of different QA tasks by formulating them into text-to-text tasks under T5.  ... 
arXiv:2205.04040v1 fatcat:b6kl7zvc7rgc3d6shubgfbxjbq
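The "structural prompt" mentioned above is the part a short sketch can make concrete. The helper below renders a QA instance as a keyed prompt string with slots for task, format, question, options, and context; the slot names and separators are invented for illustration and do not reproduce ProQA's exact schema.

```python
from typing import List, Optional

def build_structural_prompt(task: str, fmt: str, question: str,
                            context: str = "",
                            options: Optional[List[str]] = None) -> str:
    """Assemble a keyed, format-aware prompt; slot names are illustrative only."""
    parts = [f"[TASK] {task}", f"[FORMAT] {fmt}", f"[QUESTION] {question}"]
    if options:
        parts.append("[OPTIONS] " + " | ".join(options))
    if context:
        parts.append(f"[CONTEXT] {context}")
    return " ".join(parts)

# A multiple-choice instance rendered as a single model input.
print(build_structural_prompt(
    task="openbookqa", fmt="multiple-choice",
    question="Which gas do plants absorb from the air?",
    options=["oxygen", "carbon dioxide", "nitrogen"]))
```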

Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction [article]

Ming Shen, Pratyay Banerjee, Chitta Baral
2021 arXiv   pre-print
Our method outperforms the RoBERTa-large baseline by large margins while achieving a higher AUC score after further fine-tuning on the remaining three official splits of WinoGrande.  ...  In this work, we propose Masked Noun-Phrase Prediction (MNPP), a pre-training strategy to tackle pronoun resolution in a fully unsupervised setting.  ...  UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online.  ... 
arXiv:2105.12392v2 fatcat:jx3mpzdipzhqdk4mwleirekpi4

ActKnow: Active External Knowledge Infusion Learning for Question Answering in Low Data Regime [article]

K. M. Annervaz, Pritam Kumar Nath, Ambedkar Dukkipati
2021 arXiv   pre-print
We propose a technique called ActKnow that actively infuses knowledge from Knowledge Graphs (KG) "on-demand" into learning for Question Answering (QA).  ...  For example, using only 20% of the training examples, we demonstrate a 4% improvement in accuracy on both ARC-Challenge and OpenBookQA.  ...  Unifiedqa: Crossing format boundaries with a single qa system. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018.  ... 
arXiv:2112.09423v1 fatcat:qoojc2yimbhlpp3goiniri3zbi

Measuring Massive Multitask Language Understanding [article]

Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
2021 arXiv   pre-print
We propose a new test to measure a text model's multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more.  ...  By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings  ...  Khot, A. Sabharwal, O. Tafjord, P. Clark, and H. Hajishirzi. Unifiedqa: Crossing format boundaries with a single qa system, 2020. T. Khot, P. Clark, M. Guerquin, P. Jansen, and A. Sabharwal.  ... 
arXiv:2009.03300v3 fatcat:idvsmqeehnhbdmd5z7yz7oszoq
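The snippet describes a 57-task test of multitask accuracy but not how scores are combined. A minimal sketch, assuming per-task correctness flags and an unweighted macro-average over tasks (the paper reports averages over subjects; check the released code for the exact protocol):

```python
from typing import Dict, List

def multitask_accuracy(results: Dict[str, List[bool]]) -> float:
    """Mean of per-task accuracies, with each task weighted equally."""
    per_task = [sum(flags) / len(flags) for flags in results.values() if flags]
    return sum(per_task) / len(per_task)

# Hypothetical correctness flags for three of the 57 subjects.
print(multitask_accuracy({
    "us_history": [True, False, True, True],
    "college_physics": [False, False, True],
    "professional_law": [True, True, False, False],
}))
```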

Hey AI, Can You Solve Complex Tasks by Talking to Agents? [article]

Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal
2022 arXiv   pre-print
We design a synthetic benchmark, CommaQA, with three complex reasoning tasks (explicit, implicit, numeric) designed to be solved by communicating with existing QA agents.  ...  To help develop models that can leverage existing systems, we propose a new challenge: Learning to solve complex tasks by communicating with existing agents (or models) in natural language.  ...  UnifiedQA: Crossing format boundaries with a single QA system. In Findings of EMNLP. Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hannaneh Hajishirzi, and Chris Callison-Burch. 2021.  ... 
arXiv:2110.08542v2 fatcat:xuxwyqraybfezo4645p5k4odmu

QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension [article]

Anna Rogers, Matt Gardner, Isabelle Augenstein
2021 arXiv   pre-print
We further discuss the current classifications of "reasoning types" in question answering and propose a new taxonomy.  ...  We provide an overview of the various formats and domains of the current resources, highlighting the current lacunae for future work.  ...  To this end, UnifiedQA [128] proposes a single "input" format to which they convert extractive, freeform, categorical (boolean) and multi-choice questions from 20 datasets, showing that cross-format  ... 
arXiv:2107.12708v1 fatcat:sfwmrimlgfg4xkmmca6wspec7i
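The snippet above points at UnifiedQA's central device: one plain-text input encoding shared by extractive, freeform, boolean, and multiple-choice questions. The sketch below produces such an encoding (lower-cased question, lettered choices, then the passage, newline-separated); it loosely follows the paper's description, and the exact separators and casing should be treated as an approximation of, not a substitute for, the official preprocessing.

```python
import string
from typing import List, Optional

def unified_encode(question: str, context: str = "",
                   choices: Optional[List[str]] = None) -> str:
    """Render any QA instance as a single text-to-text input string."""
    parts = [question.strip().lower()]
    if choices:  # multiple-choice: lettered options such as "(a) ... (b) ..."
        parts.append(" ".join(f"({string.ascii_lowercase[i]}) {c.lower()}"
                              for i, c in enumerate(choices)))
    if context:  # extractive / boolean: append the passage
        parts.append(context.lower())
    return "\n".join(parts)

# A multiple-choice item and an extractive item share the same input space.
print(unified_encode("what is the boiling point of water?",
                     choices=["90 c", "100 c", "110 c"]))
print(unified_encode("who wrote hamlet?",
                     context="Hamlet is a tragedy written by William Shakespeare."))
```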

A Few More Examples May Be Worth Billions of Parameters [article]

Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy
2021 arXiv   pre-print
be learned with small amounts of labeled data.  ...  We hypothesize that unlike open question answering, which involves recalling specific information, solving strategies for tasks with a more restricted output space transfer across examples, and can therefore  ...  UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online.  ... 
arXiv:2110.04374v1 fatcat:hf2dbw4fsncy7d77y7s3uyedae

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections [article]

Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
2021 arXiv   pre-print
When evaluated on unseen tasks, meta-tuned models outperform a same-sized QA model and the previous SOTA zero-shot learning system based on natural language inference.  ...  We focus on classification tasks, and construct the meta-dataset by aggregating 43 existing datasets and annotating 441 label descriptions in a question-answering (QA) format.  ...  UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online.  ... 
arXiv:2104.04670v5 fatcat:nicxnnusjjg3jdyqch6nzy7y5m

CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP [article]

Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
2021 arXiv   pre-print
NLP datasets and converted to a unified text-to-text format.  ...  Humans can learn a new language task efficiently with only a few examples, by leveraging their knowledge obtained when learning prior tasks.  ...  UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online.  ... 
arXiv:2104.08835v2 fatcat:xnhrmmsmyzb4fjo7ealrw2vnka

SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning [article]

Roshanak Mirzaee, Hossein Rajaby Faghihi, Qiang Ning, Parisa Kordjamshidi
2021 arXiv   pre-print
Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs.  ...  This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art  ...  UnifiedQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907.  ... 
arXiv:2104.05832v1 fatcat:et2jdbr5tjh45hgkekyk74tify

Foreseeing the Benefits of Incidental Supervision [article]

Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth
2021 arXiv   pre-print
Experiments on named entity recognition (NER) and question answering (QA) show that PABI's predictions correlate well with learning performance, providing a promising way to determine, ahead of learning  ...  These could include partial labels, noisy labels, knowledge-based constraints, and cross-domain or cross-task annotations -- all having statistical associations with gold annotations but not exactly the  ...  Acknowledgements This material is based upon work supported by the US Defense Advanced Research Projects Agency (DARPA) under contracts FA8750-19-2-0201, W911NF-20-1-0080, and W911NF-15-1-0461, and a grant  ... 
arXiv:2006.05500v2 fatcat:e4yj4zu3ozfylibouota64mdhi

Recursively Summarizing Books with Human Feedback [article]

Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nisan Stiennon, Ryan Lowe, Jan Leike, Paul Christiano
2021 arXiv   pre-print
Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task.  ...  A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate.  ...  help with and work on pretrained models.  ... 
arXiv:2109.10862v2 fatcat:m6kojegijngefjixcijvl3jwt4
Showing results 1–15 of 31