Abstract: One of the most crucial tasks in Natural Language Processing (NLP) is the universality-driven development of language resources for different languages (e.g., Universal Dependencies (UD), UniMorph, PARSEME). Among these resources, the UD initiative provides cross-linguistically consistent grammatical annotation for different languages and makes it possible to share syntactic treebanks online. The above-mentioned resources lack data on Kartvelian languages. The present paper describes some of the issues associated with the development of a syntactic treebank for Georgian, carried out as part of a project on Georgian morphosyntactic computational analysis and tools for the annotation of universal syntactic dependencies. Special attention is paid to the prerequisites, the mapping of tagsets, and the syntactic annotation of Georgian, with respect to the initial syntactic treebank (README.md etc.) and the language-specific documentation files (introduction.md and index.md). The annotated files and the language documentation files have already been uploaded to the GitHub repository and used to train a UDPipe model. These preliminary results can be considered a significant foundation for future research not only on Georgian but also on other Kartvelian languages.
External IDs: dblp:conf/tbillc/LobzhanidzeMBGJ23