RuCoCo: a new Russian corpus with coreference annotation
RuCoCo: новый русскоязычный корпус кореференции

Vladimir Dobrovolskii, ABBYY, Mariia Michurina, Alexandra Ivoylova, MIPT, RSUH, MIPT, RSUH
2022 COMPUTATIONAL LINGUISTICS AND INTELLECTUAL TECHNOLOGIES unpublished
We present the Russian Coreference Corpus (RuCoCo), a new corpus with coreference annotation. The goal of RuCoCo is to provide a large number of annotated texts while maintaining high inter-annotator agreement. RuCoCo contains Russian news texts: some were annotated from scratch, and for the rest, machine-generated annotations were refined by human annotators. The corpus contains one million words and around 150,000 mentions. We make the corpus publicly available.
doi:10.28995/2075-7182-2022-21-141-149 fatcat:eyy2i75fvbeibigi57xxdp4aoy