From variation to harmonization: the UD Turkic Group initiative

Published: 27 May 2026, Last Modified: 27 May 2026, UniDive 2026, CC BY-SA 4.0
Keywords: Universal Dependencies, Turkic languages
Working Group: WG1: Corpus annotation, WG4: Quantifying and promoting diversity
WG1 Tasks: Task 1.1: Linguistic typology and multilingual corpus annotation, Task 1.3: Extensions and updates to morphosyntactic annotation guidelines
Abstract: This abstract summarizes the activities of the UD Turkic Group, established in September 2023 within the framework of the UniDive COST Action (CA21167) and operating primarily within WG1. The group aims to harmonize annotations across Universal Dependencies (UD) treebanks for Turkic languages and to develop guidelines for future initiatives. As of UD v2.17, 26 treebanks covering 12 Turkic languages or varieties exhibit considerable cross-treebank variation. These discrepancies arise from both unresolved issues in linguistic analysis and independent design choices across annotation efforts. Although Turkic languages share typologically and genetically grounded features, these are not always analysed uniformly, due to differing linguistic traditions and gaps in language-specific descriptions. To address this, the group conducts collaborative cross-linguistic analysis with the goal of reconciling divergent practices and formulating recommendations for consistent and interoperable annotation across Turkic UD treebanks.
WG4 Tasks: Task 4.1: Promoting low-resourced/endangered languages
Tracks For Type Of Contribution: Work in progress
Do You Need Visa To Attend The 4th UniDive General Meeting In Romania: No
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 64