Formal Data Structures for Tabular Formats in Language Technology

12 Mar 2021 (modified: 05 May 2023), Submitted to ESWC2021 P&D, Readers: Everyone
Keywords: language technology, data models, CoNLL-RDF, ontology
TL;DR: We present an ontology for machine-readable descriptions of tabular data formats that are widely used in natural language processing.
Abstract: In language technology and the language sciences, tabular formats with tab-separated values (TSV), often referred to as “CoNLL formats”, are a frequently used formalism for representing linguistically annotated natural language. To facilitate interoperability between them, the CoNLL-RDF ontology provides a machine-readable description of 24 such formats. This description can be used to convert between the respective TSV formats and to automatically assess conversion quality and gaps.
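
As a rough illustration of the idea, the following Python sketch shows how a machine-readable column description could drive conversion between two TSV layouts and report what a conversion cannot preserve. The column lists `FORMAT_A` and `FORMAT_B` and the functions `parse_tsv` and `convert` are hypothetical names invented for this example; they are not the vocabulary or API of the CoNLL-RDF ontology, which describes formats in RDF rather than as Python lists.

```python
from typing import Dict, List, Tuple

# Hypothetical column layouts, invented for illustration only.
# They are not taken from the CoNLL-RDF ontology.
FORMAT_A = ["ID", "WORD", "LEMMA", "POS"]
FORMAT_B = ["WORD", "POS", "HEAD"]

Token = Dict[str, str]
Sentence = List[Token]


def parse_tsv(text: str, columns: List[str]) -> List[Sentence]:
    """Parse CoNLL-style TSV: one token per line, blank line between sentences."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        # Map each tab-separated cell to its column label.
        current.append(dict(zip(columns, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences


def convert(
    sentences: List[Sentence], target_columns: List[str], missing: str = "_"
) -> Tuple[str, List[str], List[str]]:
    """Serialize tokens in the target column order.

    Returns the TSV text, the target columns the source cannot fill (gaps),
    and the source columns that the target layout drops (lost).
    """
    source_columns = set()
    for sentence in sentences:
        for token in sentence:
            source_columns.update(token)
    gaps = [c for c in target_columns if c not in source_columns]
    lost = sorted(source_columns - set(target_columns))
    lines: List[str] = []
    for sentence in sentences:
        for token in sentence:
            lines.append("\t".join(token.get(c, missing) for c in target_columns))
        lines.append("")  # blank line terminates the sentence
    return "\n".join(lines), gaps, lost
```

Calling `convert(parse_tsv(text, FORMAT_A), FORMAT_B)` would fill the unavailable `HEAD` column with the placeholder `_` and list it as a gap, while `ID` and `LEMMA` would be reported as lost. This is the kind of conversion-quality assessment that a shared format description makes automatable.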
