A Cascaded Approach for CIPS-SIGHAN Micro-Blog Word Segmentation Bakeoff 2012
2012
Workshop on Chinese Language Processing
State-of-the-art Chinese word segmentation systems achieve high performance on well-formed long documents. However, segmenting microblog text is difficult because of noise and the out-of-vocabulary (OOV) problem. In this paper, we present a Chinese micro-blog segmentation system for the CIPS-SIGHAN Word Segmentation Bakeoff 2012 track. The proposed system adopts a cascaded approach with three steps: preprocessing, word segmentation, and post-processing. In
dblp:conf/acl-sighan/ShiHS12
fatcat:5xkc6fgu7jeojgxvnmo7itwlru