Better Neural Machine Translation by Extracting Linguistic Information from BERT
2021
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
unpublished
Adding linguistic information (syntax or semantics) to neural machine translation (NMT) has mostly focused on using point estimates from pre-trained models. Directly using the capacity of massive pre-trained contextual word embedding models such as BERT (Devlin et al., 2019) has been marginally useful in NMT because effective fine-tuning is difficult to obtain for NMT without making training brittle and unreliable. We augment NMT by extracting dense fine-tuned vector-based linguistic information from BERT instead of using point estimates.
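The abstract contrasts point estimates (e.g. a single predicted tag per token) with dense, fine-tuned vector-based features taken from BERT. The sketch below is a minimal illustration of that contrast, not the authors' implementation: the checkpoint name, the tagger head, the dimensions, and the concatenation-based fusion with the NMT encoder are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's method): dense BERT features vs. point estimates.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
bert = BertModel.from_pretrained("bert-base-cased")  # stand-in for a fine-tuned BERT

sentence = "Better translation needs linguistic information."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**inputs)

# Dense, vector-based linguistic information: contextual hidden states (batch, seq_len, 768).
dense_features = outputs.last_hidden_state

# Point-estimate alternative the abstract argues against: collapse each token's
# representation to one discrete label via a (hypothetical) tagger head.
num_tags = 17  # e.g. Universal POS tag set size; illustrative
tagger_head = torch.nn.Linear(bert.config.hidden_size, num_tags)
point_estimates = tagger_head(dense_features).argmax(dim=-1)  # (batch, seq_len)

# Hypothetical fusion: project the dense features and concatenate them with the
# NMT encoder's own source-side token embeddings (random placeholders here).
nmt_embed_dim = 512  # illustrative
proj = torch.nn.Linear(bert.config.hidden_size, nmt_embed_dim)
nmt_source_embeddings = torch.randn(dense_features.size(0), dense_features.size(1), nmt_embed_dim)
fused = torch.cat([nmt_source_embeddings, proj(dense_features)], dim=-1)
print(fused.shape)  # (1, seq_len, 1024)
```

The fused tensor would then feed the translation encoder, so the NMT model sees the full dense linguistic representation rather than a single discrete label per token.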
doi:10.18653/v1/2021.eacl-main.241
fatcat:yl6q3tuq3baydczoeznbglg3ty