UnifiedQA: Crossing Format Boundaries With a Single QA System
[article]
2020
arXiv
pre-print
As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats. ...
Finally, simply fine-tuning this pre-trained QA model into specialized models results in a new state of the art on 6 datasets, establishing UnifiedQA as a strong starting point for building QA systems. ...
Acknowledgments The authors would like to thank Collin Raffel, Adam Roberts, and Nicholas Lourie for their help with the T5 framework and for providing feedback on an earlier version of this work. ...
arXiv:2005.00700v3
fatcat:x7ri5waajfeuxlniz37ptsn4pq
UNIFIEDQA: Crossing Format Boundaries with a Single QA System
2020
Findings of the Association for Computational Linguistics: EMNLP 2020
unpublished
As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UNIFIEDQA, that performs well across 20 QA datasets spanning 4 diverse formats. ...
building QA systems. ...
Acknowledgments The authors would like to thank Collin Raffel, Adam Roberts, and Nicholas Lourie for their help with the T5 framework and for providing feedback on an earlier version of this work. ...
doi:10.18653/v1/2020.findings-emnlp.171
fatcat:x3xinbne4vct5kffc3dbkdpixa
MetaQA: Combining Expert Agents for Multi-Skill Question Answering
[article]
2021
arXiv
pre-print
We release our code and a dataset of answer predictions from expert agents for 16 QA datasets to foster future developments of multi-agent systems on https://github.com/UKPLab/MetaQA. ...
out-of-domain scenarios, ii) is highly data-efficient to train, and iii) can be adapted to any QA format. ...
Acknowledgements This work has been supported by the German Research Foundation (DFG) as part of the project UKP-SQuARE with the number GU 798/29-1. ...
arXiv:2112.01922v2
fatcat:dlxkcmww35ghvoifyvtjhrnjru
ProQA: Structural Prompt-based Pre-training for Unified Question Answering
[article]
2022
arXiv
pre-print
To address this issue, we present ProQA, a unified QA paradigm that solves various tasks through a single model. ...
Furthermore, ProQA is pre-trained with structural prompt-formatted large-scale synthesized corpus, which empowers the model with the commonly-required QA ability. ...
., QA datasets). UnifiedQA (Khashabi et al., 2020b) crosses the format boundaries of different QA tasks by formulating them into text-to-text tasks under T5. ...
arXiv:2205.04040v1
fatcat:b6kl7zvc7rgc3d6shubgfbxjbq
Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction
[article]
2021
arXiv
pre-print
Our method outperforms RoBERTa-large baseline with large margins, meanwhile, achieving a higher AUC score after further finetuning on the remaining three official splits of WinoGrande. ...
In this work, we propose Masked Noun-Phrase Prediction (MNPP), a pre-training strategy to tackle pronoun resolution in a fully unsupervised setting. ...
UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online. ...
arXiv:2105.12392v2
fatcat:jx3mpzdipzhqdk4mwleirekpi4
ActKnow: Active External Knowledge Infusion Learning for Question Answering in Low Data Regime
[article]
2021
arXiv
pre-print
We propose a technique called ActKnow that actively infuses knowledge from Knowledge Graphs (KG) based "on-demand" into learning for Question Answering (QA). ...
For example, by using only 20% training examples, we demonstrate a 4% improvement in the accuracy for both ARC-challenge and OpenBookQA, respectively. ...
Unifiedqa: Crossing format boundaries with a single qa system.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. ...
arXiv:2112.09423v1
fatcat:qoojc2yimbhlpp3goiniri3zbi
Measuring Massive Multitask Language Understanding
[article]
2021
arXiv
pre-print
We propose a new test to measure a text model's multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. ...
By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings ...
Khot, A. Sabharwal, O. Tafjord, P. Clark, and H. Hajishirzi. Unifiedqa: Crossing format boundaries with a single qa system, 2020.
T. Khot, P. Clark, M. Guerquin, P. Jansen, and A. Sabharwal. ...
arXiv:2009.03300v3
fatcat:idvsmqeehnhbdmd5z7yz7oszoq
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
[article]
2022
arXiv
pre-print
We design a synthetic benchmark, CommaQA, with three complex reasoning tasks (explicit, implicit, numeric) designed to be solved by communicating with existing QA agents. ...
To help develop models that can leverage existing systems, we propose a new challenge: Learning to solve complex tasks by communicating with existing agents (or models) in natural language. ...
UnifiedQA: Crossing format boundaries with a single QA system. In Findings of EMNLP. Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hannaneh Hajishirzi, and Chris Callison-Burch. 2021. ...
arXiv:2110.08542v2
fatcat:xuxwyqraybfezo4645p5k4odmu
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension
[article]
2021
arXiv
pre-print
We further discuss the current classifications of "reasoning types" in question answering and propose a new taxonomy. ...
We provide an overview of the various formats and domains of the current resources, highlighting the current lacunae for future work. ...
To this end, UnifiedQA [128] proposes a single "input" format to which they convert extractive, freeform, categorical (boolean) and multi-choice questions from 20 datasets, showing that cross-format ...
arXiv:2107.12708v1
fatcat:sfwmrimlgfg4xkmmca6wspec7i
A Few More Examples May Be Worth Billions of Parameters
[article]
2021
arXiv
pre-print
be learned with small amounts of labeled data. ...
We hypothesize that unlike open question answering, which involves recalling specific information, solving strategies for tasks with a more restricted output space transfer across examples, and can therefore ...
UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online. ...
arXiv:2110.04374v1
fatcat:hf2dbw4fsncy7d77y7s3uyedae
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
[article]
2021
arXiv
pre-print
When evaluated on unseen tasks, meta-tuned models outperform a same-sized QA model and the previous SOTA zero-shot learning system based on natural language inference. ...
We focus on classification tasks, and construct the meta-dataset by aggregating 43 existing datasets and annotating 441 label descriptions in a question-answering (QA) format. ...
UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online. ...
arXiv:2104.04670v5
fatcat:nicxnnusjjg3jdyqch6nzy7y5m
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
[article]
2021
arXiv
pre-print
NLP datasets and converted to a unified text-to-text format. ...
Humans can learn a new language task efficiently with only few examples, by leveraging their knowledge obtained when learning prior tasks. ...
UNIFIEDQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907, Online. ...
arXiv:2104.08835v2
fatcat:xnhrmmsmyzb4fjo7ealrw2vnka
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
[article]
2021
arXiv
pre-print
Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. ...
This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art ...
UnifiedQA: Crossing format boundaries with a single QA system. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896-1907. ...
arXiv:2104.05832v1
fatcat:et2jdbr5tjh45hgkekyk74tify
Foreseeing the Benefits of Incidental Supervision
[article]
2021
arXiv
pre-print
Experiments on named entity recognition (NER) and question answering (QA) show that PABI's predictions correlate well with learning performance, providing a promising way to determine, ahead of learning ...
These could include partial labels, noisy labels, knowledge-based constraints, and cross-domain or cross-task annotations -- all having statistical associations with gold annotations but not exactly the ...
Acknowledgements This material is based upon work supported by the US Defense Advanced Research Projects Agency (DARPA) under contracts FA8750-19-2-0201, W911NF-20-1-0080, and W911NF-15-1-0461, and a grant ...
arXiv:2006.05500v2
fatcat:e4yj4zu3ozfylibouota64mdhi
Recursively Summarizing Books with Human Feedback
[article]
2021
arXiv
pre-print
Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task. ...
A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. ...
help with and work on pretrained models. ...
arXiv:2109.10862v2
fatcat:m6kojegijngefjixcijvl3jwt4