Multilingual Coreference Resolution with Harmonized Annotations [article]

Ondřej Pražák, Miloslav Konopík, Jakub Sido
2021 arXiv pre-print
In this paper, we present coreference resolution experiments with CorefUD, a newly created multilingual corpus. We focus on the following languages: Czech, Russian, Polish, German, Spanish, and Catalan. In addition to monolingual experiments, we combine the training data in multilingual experiments and train two joint models: one for the Slavic languages and one for all the languages together. We rely on an end-to-end deep learning model that we slightly adapted for the CorefUD corpus. Our results show that we can profit from harmonized annotations, and that using joint models helps significantly for the languages with less training data.
arXiv:2107.12088v2 fatcat:re5j4t5yqvfh7ednnnuejfknke