Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank
[article] · 2020 · arXiv pre-print
Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale. This work improves the prediction and annotation of fine-grained semantic divergences. We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity. We evaluate our models on the …
arXiv:2010.03662v1
fatcat:cqpq5cc7xfd7bc2ukkrfx5lwli
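To make the training strategy in the abstract concrete, here is a minimal sketch of learning to rank divergent sentence pairs with multilingual BERT using a margin ranking loss. This is not the authors' released implementation: the checkpoint name, the linear scoring head over the [CLS] token, the margin value, and the toy French example are all illustrative assumptions.

```python
# Sketch: rank less-divergent sentence pairs above more-divergent synthetic ones
# by fine-tuning mBERT with a margin ranking loss. All names below are
# assumptions for illustration, not the paper's released code.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
scorer = nn.Linear(encoder.config.hidden_size, 1)  # hypothetical scoring head

def divergence_score(src: str, tgt: str) -> torch.Tensor:
    """Score a sentence pair by encoding both segments jointly with mBERT."""
    inputs = tokenizer(src, tgt, return_tensors="pt", truncation=True)
    cls = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] representation
    return scorer(cls).squeeze(-1)

# Learning to rank: the more equivalent pair should outscore the more
# divergent (synthetic) pair by at least `margin`.
loss_fn = nn.MarginRankingLoss(margin=1.0)

def ranking_step(less_divergent, more_divergent):
    s_pos = divergence_score(*less_divergent)
    s_neg = divergence_score(*more_divergent)
    target = torch.ones_like(s_pos)  # s_pos should rank above s_neg
    return loss_fn(s_pos, s_neg, target)

# Toy usage: an adequate translation vs. a synthetically perturbed one.
loss = ranking_step(
    ("The cat sat on the mat.", "Le chat était assis sur le tapis."),
    ("The cat sat on the mat.", "Le chien dormait dans le jardin."),
)
loss.backward()  # gradients flow into mBERT and the scoring head
```

Synthetic divergent examples of varying granularity (e.g., small lexical edits vs. wholesale substitutions) would supply the ranked pairs, so the model learns a graded notion of divergence without manual annotation.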