Supervised Morphology Generation Using Parallel Corpus

Alireza Mahmoudi, Mohsen Arabsorkhi, Heshaam Faili
2013 Recent Advances in Natural Language Processing  
Translating from English, a morphologically poor language, into morphologically rich languages such as Persian poses many challenges. In this paper, we present an approach to rich morphology prediction using a parallel corpus. We focus on verb conjugation as the most important and problematic morphological phenomenon in Persian. We define a set of linguistic features using both English and Persian linguistic information, and use an English-Persian parallel corpus to ... our model. Then, we predict six morphological features of the verb and generate the inflected verb form from its lemma. As a baseline in our experiments, we generate the verb form with the most common feature values. The results show an improvement of almost 2.1% absolute BLEU score on a test set containing 16K sentences.
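The most-common-value baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the six morphological features, so the feature names and values below (`tense`, `person`, `number`) and the helper `most_common_values` are hypothetical, and the toy annotations are made up for illustration.

```python
from collections import Counter


def most_common_values(training_rows):
    """Return the most frequent value of each feature across the training rows."""
    counts = {}
    for row in training_rows:
        for feature, value in row.items():
            # One Counter per feature, created on first sight of that feature.
            counts.setdefault(feature, Counter())[value] += 1
    return {
        feature: counter.most_common(1)[0][0]
        for feature, counter in counts.items()
    }


# Toy, made-up verb annotations; feature names and values are hypothetical.
toy_rows = [
    {"tense": "past", "person": "3", "number": "sg"},
    {"tense": "present", "person": "3", "number": "sg"},
    {"tense": "past", "person": "1", "number": "pl"},
]

# The baseline assigns these same values to every verb, regardless of context.
baseline = most_common_values(toy_rows)
print(baseline)
```

A baseline of this kind ignores the source sentence entirely, which is why a model that uses contextual features from both languages would be expected to improve on it.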
dblp:conf/ranlp/MahmoudiAF13 fatcat:ta5nzpcpebgcpc34dkp4ndurey