A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
This paper proposes an approach to crosslanguage sentence selection in a low-resource setting. It uses data augmentation and negative sampling techniques on noisy parallel sentence data to directly learn a cross-lingual embedding-based query relevance model. Results show that this approach performs as well as or better than multiple state-of-theart machine translation + monolingual retrieval systems trained on the same parallel data. Moreover, when a rationale training secondary objective isdoi:10.18653/v1/2021.acl-long.300 fatcat:2ccr75mx6rduxkqg6w4fijgygu