Detecting Segmentation Errors in Chinese Annotated Corpus
2005
Workshop on Chinese Language Processing
This paper proposes a semi-automatic method for detecting segmentation errors in a manually annotated Chinese corpus, with the aim of further improving the corpus's quality. A given Chinese character string that occurs more than once in a corpus may be assigned different segmentations during the segmentation process. Based on these differences, our approach outputs the segmentation error candidates found in a segmented corpus; the actual segmentation errors are then identified manually from among these candidates. Segmentation error
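The candidate-finding idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the corpus is given as a list of sentences, each a list of words, and it only compares character strings that begin and end on word boundaries. The function name `find_inconsistent_segmentations` and the `max_words` parameter are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict


def find_inconsistent_segmentations(corpus, max_words=3):
    """Return character strings that receive more than one segmentation.

    corpus: list of sentences, each a list of words (assumed input format).
    max_words: longest word span to consider (hypothetical tuning knob).
    """
    # Maps each character string to the segmentations seen for it,
    # with a count of how often each segmentation occurs.
    variants = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        for start in range(len(sentence)):
            last = min(start + max_words, len(sentence))
            for end in range(start + 1, last + 1):
                span = tuple(sentence[start:end])
                variants["".join(span)][span] += 1
    # A string is an error candidate only if it was segmented
    # in at least two different ways somewhere in the corpus.
    return {
        string: dict(segs)
        for string, segs in variants.items()
        if len(segs) > 1
    }
```

Each returned string comes with its competing segmentations and their counts, which a human annotator could then review to decide which occurrences, if any, are actual errors.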
dblp:conf/acl-sighan/SunHWL05
fatcat:xto7z5seajdolh5qjawehhymki