A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is
We present a character-based model for joint segmentation and POS tagging for Chinese. The bidirectional RNN-CRF architecture for general sequence tagging is adapted and applied with novel vector representations of Chinese characters that capture rich contextual information and lower-than-character level features. The proposed model is extensively evaluated and compared with a state-of-the-art tagger respectively on CTB5, CTB9 and UD Chinese. The experimental results indicate that our model isarXiv:1704.01314v3 fatcat:vvztyzjryzckxf7fulgxlfsfx4