Unsupervised Question Answering by Cloze Translation

Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
We propose and compare various unsupervised ways to perform cloze-to-natural question translation, including training an unsupervised NMT model using non-aligned corpora of natural questions and cloze questions  ...  Next we convert answers in context to "fill-in-the-blank" cloze questions and finally translate them into natural questions.  ...  Supplementary Materials for ACL 2019 Paper: Unsupervised Question Answering by Cloze Translation A Appendices A.1 Cloze Question Featurization and Translation High Level Answer Category Named  ... 
doi:10.18653/v1/p19-1484 dblp:conf/acl/LewisDR19 fatcat:ta6lu2buvbanplnqs6zjkqrnme

Harvesting and Refining Question-Answer Pairs for Unsupervised QA [article]

Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu
2020 arXiv   pre-print
Our approach outperforms previous unsupervised approaches by a large margin and is competitive with early supervised models.  ...  First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named as RefQA).  ...  Acknowledgements The work was partially supported by National Natural Science Foundation of China (NSFC) [Grant No. 61421003].  ... 
arXiv:2005.02925v1 fatcat:3zlvm7bsrje4jcs6pbvtyodyyy

How Context Affects Language Models' Factual Predictions [article]

Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
2020 arXiv   pre-print
question answering.  ...  question-answering performance and making its predictions robust to noisy contexts.  ...  Unsupervised Question Answering Our work is part of growing body of work that demonstrate that unsupervised question answering is not only possible, but beginning to reach and even outperform some standard  ... 
arXiv:2005.04611v1 fatcat:uylkyzvscve5dgcofihq7g3mfu

Self-Supervised Test-Time Learning for Reading Comprehension [article]

Pratyay Banerjee, Tejas Gokhale, Chitta Baral
2021 arXiv   pre-print
Recent work on unsupervised question answering has shown that models can be trained with procedurally generated question-answer pairs and can achieve performance competitive with supervised methods.  ...  human-authored datasets containing context-question-answer triplets.  ...  Cloze Translation is utilized to rephrase cloze questions into more natural questions by using rule-based methods from .  ... 
arXiv:2103.11263v1 fatcat:6tqbqs5a45bupn6pxh3ajy6xvy

When in Doubt, Ask: Generating Answerable and Unanswerable Questions, Unsupervised [article]

Liubov Nikolenko, Pouya Rezazadeh Kalehbasti
2020 arXiv   pre-print
Specifically, unanswerable question-answers prove more effective in boosting the model: the F1 score gain from adding to the original dataset the answerable, unanswerable, and combined question-answers  ...  Question Answering (QA) is key for making possible a robust communication between human and machine.  ...  We use an unsupervised generator-discriminator model based on cloze translation to generate answerable questions, following the work by Lewis et al.  ... 
arXiv:2010.01611v2 fatcat:6hdetreda5egharx3kfo7ok7ja

Unsupervised Pre-training for Biomedical Question Answering [article]

Vaishnavi Kommaraju, Karthick Gunasekaran, Kun Li, Trapit Bansal, Andrew McCallum, Ivana Williams, Ana-Maria Istrate
2020 arXiv   pre-print
We explore the suitability of unsupervised representation learning methods on biomedical text -- BioBERT, SciBERT, and BioSentVec -- for biomedical question answering.  ...  To further improve unsupervised representations for biomedical QA, we introduce a new pre-training task from unlabeled data designed to reason about biomedical entities in the context.  ...  by the National Science Foundation under Grant No.  ... 
arXiv:2009.12952v1 fatcat:lwjccrybcrcjpon3tnoxavazsa

Analysing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets [article]

Changchang Zeng, Shaobo Li
2021 arXiv   pre-print
Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying.  ...  task, long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pre-trained four masked language models according to the answer length distributions  ...  Because some translations cannot find the answers in the original text (the answer translation and document translation are different), the amount of data is reduced compared to the original English version  ... 
arXiv:2110.15712v1 fatcat:owpminh76nchrocxhs6tusd27u

Unsupervised Dense Information Retrieval with Contrastive Learning [article]

Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
2022 arXiv   pre-print
However, they do not transfer well to new applications with no training data, and are outperformed by unsupervised term-frequency methods such as BM25.  ...  On the BEIR benchmark our unsupervised model outperforms BM25 on 11 out of 15 datasets for the Recall@100 metric.  ...  The MKQA dataset makes this possible by providing the same questions and answers in 26 languages.  ... 
arXiv:2112.09118v3 fatcat:6qgjos3jbjcdlcylwyjy2hwblq

Pre-training Text Representations as Meta Learning [article]

Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou (+1 others)
2020 arXiv   pre-print
However, existing approaches are optimized by minimizing a proxy objective, such as the negative log likelihood of language modeling.  ...  We study the problem in two settings: unsupervised pre-training and supervised pre-training with different pre-training objects to verify the generality of our approach. Experimental results show that our  ...  Question-answer pair matching aims to determine if the given answer can answer the question properly and question-question pair matching aims to determine if two questions have the same meaning.  ... 
arXiv:2004.05568v1 fatcat:n3w2gkc5jnex7nbgknok4zg46i

Deep learning based question answering system in Bengali

Tasmiah Tahsin Mayeesha, Abdullah Md Sarwar, Rashedur M. Rahman
2020 Journal of Information and Telecommunication  
Recent advances in the field of natural language processing have improved state-of-the-art performances on many tasks including question answering for languages like English.  ...  Unlike English, there is no benchmark large scale QA dataset collected for Bengali, no pretrained language model that can be modified for Bengali question answering and no human baseline score for QA has  ...  After translation we found by observation that the context, questions and answers were translated with high quality in general and the meanings of the sentences were preserved.  ... 
doi:10.1080/24751839.2020.1833136 fatcat:ltwrsufie5hrrezjtv2tu56fjy

Subword-augmented Embedding for Cloze Reading Comprehension [article]

Zhuosheng Zhang, Yafang Huang, Hai Zhao
2018 arXiv   pre-print
We also empirically explore different augmentation strategies on subword-augmented embedding to enhance the cloze-style reading comprehension model reader.  ...  In CMRC-2017, we observe questions with OOV answers (denoted as "OOV questions") account for 17.22% in the error results of the best Word + Char embedding based model.  ...  Sennrich et al. (2016) introduced the byte pair encoding (BPE) compression algorithm into neural machine translation for being capable of open-vocabulary translation by encoding rare and unknown words  ... 
arXiv:1806.09103v1 fatcat:fmpfonec7rha5k64p2rqjnu7l4

PQuAD: A Persian Question Answering Dataset [article]

Kasra Darvishi, Newsha Shahbodagh, Zahra Abbasiantaeb, Saeedeh Momtazi
2022 arXiv   pre-print
By releasing this dataset, we aim to ease research on Persian reading comprehension and development of Persian question answering systems.  ...  It includes 80,000 questions along with their answers, with 25% of the questions being adversarially unanswerable.  ...  ., 2020b) , a trained neural machine translation and a trained unsupervised word alignment model are developed for automatically translating the SQuAD dataset to Spanish.  ... 
arXiv:2202.06219v1 fatcat:h32u2w7znvbvvbtup3md4h6c2i

Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision [article]

Ning Bian, Xianpei Han, Bo Chen, Hongyu Lin, Ben He, Le Sun
2021 arXiv   pre-print
The pre-training tasks for PLMs are not question-answering or MRC-based tasks, making existing PLMs unable to be directly used for unsupervised MRC.  ...  Firstly, we propose to learn to spot answer spans in documents via self-supervised learning, by designing a self-supervision pretext task for MRC - Spotting-MLM.  ...  Lewis et al. (2019) adopt an unsupervised translation model to transform cloze questions into natural questions.  ... 
arXiv:2107.08582v1 fatcat:kgtypobnbbghhh4tratbi37c7m

Language Model Augmented Relevance Score [article]

Ruibo Liu, Jason Wei, Soroush Vosoughi
2021 arXiv   pre-print
MARS leverages off-the-shelf language models, guided by reinforcement learning, to create augmented references that consider both the generation context and available human references, which are then used  ...  Question Answering. For question answering, we use the MOCHA dataset, which includes human ratings on outputs of five models trained on six QA datasets (Chen et al., 2020).  ...  NLG tasks, such as story generation, news summarization, and question-answering (Tao et al., 2018; Nema and Khapra, 2018).  ... 
arXiv:2108.08485v1 fatcat:fzjtjd4nwfgk5m2q4uktbi2lp4

ERNIE: Enhanced Representation through Knowledge Integration [article]

Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu
2019 arXiv   pre-print
answering.  ...  We also demonstrate that ERNIE has more powerful knowledge inference capacity on a cloze test.  ...  Retrieval Question Answering The goal of NLPCC-DBQA dataset (http://tcci.ccf.org.cn/conference/2016/dldoc/evagline2.pdf) is to select answers of the corresponding questions.  ... 
arXiv:1904.09223v1 fatcat:tgbhnpobindobkzv5zwpnw7kg4