Weakly Supervised Morphology Learning for Agglutinating Languages Using Small Training Sets
2010
International Conference on Computational Linguistics
The paper describes a weakly supervised approach for decomposing words into all morphemes: stems, prefixes and suffixes, using wordforms with marked stems as training data. As we concentrate on under-resourced languages, the amount of training data is limited and we need some amount of supervision in the form of a small number of wordforms with marked stems. In the first stage we introduce a new Supervised Stem Extraction algorithm (SSE). Once stems have been extracted, an improved unsupervised
dblp:conf/coling/ShalonovaG10
fatcat:wedjkunttvhorpx7p3uzlu3d6i