A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2018; you can also visit the original URL.
The file type is application/pdf.
Orthographic Syllable as basic unit for SMT between Related Languages
2016
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
We explore the use of the orthographic syllable, a variable-length consonant-vowel sequence, as a basic unit of translation between related languages which use abugida or alphabetic scripts. We show that orthographic syllable level translation significantly outperforms models trained over other basic units (word, morpheme and character) when training over small parallel corpora.
doi:10.18653/v1/d16-1196
dblp:conf/emnlp/KunchukuttanB16
fatcat:n5zd67iw3nhn7amdbcyvjmg3ie