Proceedings of the 15th conference on Computational linguistics
This article presents a novel Chinese class n-gram model for contextual postprocessing of handwriting recognition results. The word classes in the model are automatically discovered by a corpus-based simulated annealing procedure. Three other language models, least-word, word-frequency, and the powerful interword character bigram model, have been constructed for comparison. Extensive experiments on large text corpora show that the discovered class bigram model outperforms the other three competing models.
doi:10.3115/991250.991350 fatcat:vzbxgde5kbbyrncbr5cq2iftje