RuCoCo: a new Russian corpus with coreference annotation

Published: 09 Jun 2022, Last Modified: 31 Jan 2024 | OpenReview Archive Direct Upload | Everyone | CC BY-NC-ND 4.0
Abstract: We present the Russian Coreference Corpus (RuCoCo), a new corpus with coreference annotation. The goal of RuCoCo is to obtain a large number of annotated texts while maintaining high inter-annotator agreement. RuCoCo contains news texts in Russian: some were annotated from scratch, and for the rest, machine-generated annotations were refined by human annotators. The corpus contains one million words and around 150,000 mentions. We make the corpus publicly available.