Unsupervised learning of agglutinated morphology using nested Pitman-Yor process based morpheme induction algorithm

Arun Kumar, Lluis Padro, Antoni Oliver
2015 2015 International Conference on Asian Language Processing (IALP)  
In this paper we describe a method for morphologically segmenting highly agglutinating and inflectional languages from the Dravidian family. We use a nested Pitman-Yor process to segment long agglutinated words into their basic components, and a corpus-based morpheme induction algorithm to perform morpheme segmentation. We test our method on two languages, Malayalam and Kannada, and compare the results with Morfessor.
doi:10.1109/ialp.2015.7451528 dblp:conf/ialp/KumarPO15 fatcat:feva73iv25c53nz2jhipjzkx7i
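
The abstract names the Pitman-Yor process without stating its form. As a rough illustration only, the sketch below shows the standard single-level Pitman-Yor "Chinese restaurant" seating rule, which is the usual building block of such models. It is not the authors' nested model or their induction algorithm. The function names, the parameter names `discount` and `strength`, and the example values are all assumptions made for this sketch.

```python
import random


def pitman_yor_seating_probs(table_counts, discount, strength):
    """Seating probabilities for the next customer in a Pitman-Yor
    Chinese restaurant process (illustrative sketch, hypothetical names).

    table_counts: number of customers already seated at each table.
    Returns (probabilities for each existing table, probability of a new table).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if strength <= -discount:
        raise ValueError("strength must be greater than -discount")

    total = sum(table_counts)
    if total == 0:
        # The first customer always opens a new table.
        return [], 1.0

    num_tables = len(table_counts)
    denom = total + strength
    # Existing table k is chosen with probability (n_k - d) / (n + theta).
    existing = [(count - discount) / denom for count in table_counts]
    # A new table is opened with probability (theta + d * K) / (n + theta).
    new_table = (strength + discount * num_tables) / denom
    return existing, new_table


def sample_seating(num_customers, discount, strength, rng):
    """Seat customers one by one and return the final table sizes."""
    counts = []
    for _ in range(num_customers):
        existing, new_table = pitman_yor_seating_probs(counts, discount, strength)
        choice = rng.choices(range(len(counts) + 1), weights=existing + [new_table])[0]
        if choice == len(counts):
            counts.append(1)      # open a new table
        else:
            counts[choice] += 1   # join an existing table
    return counts


if __name__ == "__main__":
    # Assumed example values, not taken from the paper.
    print(pitman_yor_seating_probs([3, 1], discount=0.5, strength=1.0))
    print(sample_seating(50, discount=0.5, strength=1.0, rng=random.Random(0)))
```

In a segmentation setting, tables would typically correspond to morpheme types and customers to morpheme tokens; the discount tends to produce the many-rare-types distribution common in morphologically rich vocabularies. How the paper nests such processes is not stated in the abstract and is not modeled here.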