ANCOR_Centre, a large free spoken French coreference corpus: description of the resource and reliability measures

Published: 01 Jan 2014, Last Modified: 22 Dec 2023, LREC 2014, Readers: Everyone
Abstract: This article presents ANCOR_Centre, a French coreference corpus, available under the Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a certain variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations, which involve nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource.