Exploring Confidence-based Self-training for Multilingual Dependency Parsing in an Under-Resourced Language Scenario

Juntao Yu, Bernd Bohnet
Third International Conference on Dependency Linguistics (Depling 2015)
This paper presents a novel self-training approach that we use to explore a scenario typical of under-resourced languages. We apply self-training to small multilingual dependency corpora of nine languages. Our approach employs a confidence-based method to gain additional training data from large unlabeled datasets. The method has been shown effective for five of the nine languages of the SPMRL Shared Task 2014 datasets. We obtained the largest absolute improvement, two percentage points, on the Korean data. Our self-training experiments show improvements over the best state-of-the-art systems of the SPMRL shared task that employ a single parser.
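To make the described procedure concrete, the following is a minimal sketch of a confidence-based self-training loop. The `ToyParser` class, its confidence scores, and the 0.9 threshold are illustrative assumptions for the sketch; the paper's actual parser and confidence estimation method are not reproduced here.

```python
# Hypothetical sketch of confidence-based self-training for dependency
# parsing, in the spirit of the abstract above. The parser interface,
# the confidence measure, and the threshold are illustrative assumptions,
# not the authors' implementation.

import random
from dataclasses import dataclass


@dataclass
class ParsedSentence:
    tokens: list        # surface tokens
    heads: list         # 1-based head index per token (0 = root)
    confidence: float   # parser-assigned confidence in [0, 1]


class ToyParser:
    """Stand-in for a real statistical dependency parser."""

    def train(self, treebank):
        # A real parser would estimate model parameters here;
        # this stub only records how much data it was given.
        self.num_training_sentences = len(treebank)

    def parse(self, tokens):
        # Dummy chain analysis (each token attaches to its predecessor,
        # the first token to the root) with a random confidence score;
        # a real parser would derive confidence from model scores.
        heads = list(range(len(tokens)))
        return ParsedSentence(tokens, heads, random.random())


def self_train(labeled, unlabeled, threshold=0.9, rounds=1):
    """Generic confidence-based self-training loop (sketch)."""
    parser = ToyParser()
    training_data = list(labeled)
    for _ in range(rounds):
        parser.train(training_data)
        # Parse the unlabeled pool and keep only high-confidence analyses.
        parsed = [parser.parse(sentence) for sentence in unlabeled]
        selected = [p for p in parsed if p.confidence >= threshold]
        # Add the automatically parsed sentences as extra training data.
        training_data.extend(selected)
    parser.train(training_data)
    return parser


if __name__ == "__main__":
    seed = [ParsedSentence(["a", "small", "treebank"], [3, 3, 0], 1.0)]
    pool = [["some", "raw", "text"], ["more", "unlabeled", "text"]]
    self_train(seed, pool)
```

In a setting like the one the abstract describes, the unlabeled pool is much larger than the seed treebank, so the confidence threshold governs the trade-off between the noise and the amount of automatically parsed data added to training.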