Terminology-Aware Segmentation and Domain Feature for the WMT19 Biomedical Translation Task
2019
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
In this work, we describe the TALP-UPC systems submitted to the WMT19 Biomedical Translation Task. Our proposed strategy is independent of the NMT model and relies on a single ingredient: a biomedical terminology list. We first extracted this terminology list by labelling biomedical words in our training dataset using the BabelNet API. Then, we designed a data preparation strategy to insert the term information at the token level. Finally, we trained the Transformer model (Vaswani et
doi:10.18653/v1/w19-5418
dblp:conf/wmt/CarrinoRCF19
fatcat:rh37dboj5jawtgt3llrwhlb7nm