Generalizing Question Answering System with Pre-trained Language Model Fine-tuning

Dan Su, Yan Xu, Genta Indra Winata, Peng Xu, Hyeondey Kim, Zihan Liu, Pascale Fung
2019 Proceedings of the 2nd Workshop on Machine Reading for Question Answering  
With a large number of datasets being released and new techniques being proposed, question answering (QA) systems have witnessed great breakthroughs in reading comprehension (RC) tasks. However, most existing methods focus on improving in-domain performance, leaving open the research question of how these models and techniques can generalize to out-of-domain and unseen RC tasks. To enhance the generalization ability, we propose a multi-task learning framework that learns the shared ... across different tasks. Our model is built on top of a large pre-trained language model, such as XLNet, and then fine-tuned on multiple RC datasets. Experimental results show the effectiveness of our methods, with an average Exact Match score of 56.59 and an average F1 score of 68.98, which significantly improve on the BERT-Large baseline by 8.39 and 7.22, respectively.
* These two authors contributed equally.
doi:10.18653/v1/d19-5827 dblp:conf/acl-mrqa/SuXWXKLF19 fatcat:wtcgfgoua5hejot4jmeefucag4