Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF [article]

Yan Shao and Christian Hardmeier and Jörg Tiedemann and Joakim Nivre
2017 arXiv pre-print
We present a character-based model for joint segmentation and POS tagging for Chinese. The bidirectional RNN-CRF architecture for general sequence tagging is adapted and applied with novel vector representations of Chinese characters that capture rich contextual information and lower-than-character level features. The proposed model is extensively evaluated and compared with a state-of-the-art tagger on CTB5, CTB9 and UD Chinese. The experimental results indicate that our model is accurate and robust across datasets of different sizes, genres and annotation schemes. We obtain state-of-the-art performance on CTB5, achieving 94.38 F1-score for joint segmentation and POS tagging.
arXiv:1704.01314v3 fatcat:vvztyzjryzckxf7fulgxlfsfx4
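
Treating joint segmentation and POS tagging as character-level sequence tagging means each character receives one label that encodes both its position in a word and that word's part of speech. The abstract does not spell out the label scheme, so the minimal sketch below assumes a B/I/E/S boundary prefix joined to a POS label, a common convention for this task. The function names `encode` and `decode` and the example sentence are hypothetical and serve only as illustration; the sketch covers the label format a sequence tagger would predict, not the bidirectional RNN-CRF model itself.

```python
from typing import List, Tuple

Word = Tuple[str, str]      # (word, POS label)
CharTag = Tuple[str, str]   # (character, joint tag such as "B-NN")


def encode(words: List[Word]) -> List[CharTag]:
    """Convert (word, POS) pairs into per-character joint tags.

    Assumed scheme: S = single-character word, B = first character,
    I = inside character, E = last character of a multi-character word.
    """
    tagged: List[CharTag] = []
    for word, pos in words:
        if len(word) == 1:
            tagged.append((word, f"S-{pos}"))
            continue
        for i, ch in enumerate(word):
            if i == 0:
                boundary = "B"
            elif i == len(word) - 1:
                boundary = "E"
            else:
                boundary = "I"
            tagged.append((ch, f"{boundary}-{pos}"))
    return tagged


def decode(tagged: List[CharTag]) -> List[Word]:
    """Rebuild (word, POS) pairs from per-character joint tags.

    A word is closed whenever an E or S tag is seen. Any characters
    left over at the end (an ill-formed sequence) are emitted as a
    final word carrying the POS of the last tag.
    """
    words: List[Word] = []
    buffer = ""
    for ch, tag in tagged:
        boundary, pos = tag.split("-", 1)
        buffer += ch
        if boundary in ("E", "S"):
            words.append((buffer, pos))
            buffer = ""
    if buffer:
        words.append((buffer, pos))
    return words


# Hypothetical example sentence with illustrative POS labels.
sentence = [("我", "PN"), ("喜欢", "VV"), ("自然语言", "NN")]
char_tags = encode(sentence)
recovered = decode(char_tags)
```

Under this assumed scheme, segmentation and tagging are recovered together from a single predicted tag sequence, which is what allows one sequence tagger to solve both tasks jointly.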