A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
We present a model of morphological segmentation that jointly learns to segment and restore orthographic changes, e.g., funniest → fun-y-est. We term this form of analysis canonical segmentation and contrast it with the traditional surface segmentation, which segments a surface form into a sequence of substrings, e.g., funniest → funn-i-est. We derive an importance sampling algorithm for approximate inference in the model and report experimental results on English, German and Indonesian.doi:10.18653/v1/n16-1080 dblp:conf/naacl/CotterellVS16 fatcat:mx5umgpmtfh6fn4447ynlnxd44