Adaptive Compression-based Approach for Chinese Pinyin Input

Jin Hu Huang, David M. W. Powers
2004 Workshop on Chinese Language Processing  
This article presents a compression-based adaptive algorithm for Chinese Pinyin input. There are many different input methods for Chinese character text, and the phonetic Pinyin input method is the one most commonly used. Prediction by Partial Matching (PPM) is an adaptive statistical modelling technique that is widely used in the field of text compression. Compression-based approaches are able to build models very efficiently and incrementally. Experiments show that the adaptive compression-based approach for Pinyin input outperforms the modified Kneser-Ney smoothing method implemented by the SRILM language tools (Stolcke, 2002).
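
The abstract does not give implementation details, so the following is only a minimal sketch of how an adaptive PPM-style model could rank candidate characters for a Pinyin sequence. It is not the authors' implementation. It assumes an order-1 character context, escape method C, and no symbol exclusion (so the probabilities are not exactly normalised). The names `AdaptivePPM`, `convert`, `lexicon`, and `vocab_size` are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


class AdaptivePPM:
    """Order-1 PPM-style character model (illustrative; escape method C, no exclusion)."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size  # assumed size of the character inventory
        self.bigram = defaultdict(lambda: defaultdict(int))  # prev char -> next char -> count
        self.unigram = defaultdict(int)  # char -> count
        self.total = 0

    def update(self, text):
        """Adapt the model incrementally: just add counts, no retraining pass."""
        prev = None
        for ch in text:
            self.unigram[ch] += 1
            self.total += 1
            if prev is not None:
                self.bigram[prev][ch] += 1
            prev = ch

    def _order0(self, ch):
        distinct = len(self.unigram)
        denom = self.total + distinct
        if denom == 0:
            return 1.0 / self.vocab_size  # nothing seen yet: uniform fallback
        if ch in self.unigram:
            return self.unigram[ch] / denom
        # Escape from order 0 to the uniform distribution.
        return (distinct / denom) * (1.0 / self.vocab_size)

    def prob(self, prev, ch):
        ctx = self.bigram.get(prev)
        if not ctx:
            return self._order0(ch)  # unseen context: fall back to order 0
        n, d = sum(ctx.values()), len(ctx)
        if ch in ctx:
            return ctx[ch] / (n + d)
        # Escape probability d / (n + d), then back off to order 0.
        return (d / (n + d)) * self._order0(ch)


def convert(model, syllables, lexicon):
    """Viterbi search for the most probable character string for the syllables."""
    best = {None: (0.0, "")}  # last char -> (log probability, path so far)
    for syl in syllables:
        nxt = {}
        for ch in lexicon[syl]:
            score, path = max(
                ((s + math.log(model.prob(prev, ch)), p) for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[ch] = (score, path + ch)
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


# Toy usage: a tiny hypothetical lexicon and a few characters of "user text".
lexicon = {"zhong": ["中", "重"], "guo": ["国", "过"], "shi": ["是", "事"]}
model = AdaptivePPM(vocab_size=6)
model.update("中国是中国")
print(convert(model, ["zhong", "guo"], lexicon))
```

Because `update` only increments counts, the model can keep adapting to text as the user types, which is the incremental property the abstract attributes to compression-based approaches.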
dblp:conf/acl-sighan/HuangP04 fatcat:ulknfvhm4rgvzl3d47ennu3fr4