Keywords: verbal periphrasis, CoNLL-U, phrase-level enrichment, modality, conation, phase
Working Group: WG1: Corpus annotation, WG2: Lexicon-corpus interface, WG3: Multilingual and cross-lingual language technology, WG4: Quantifying and promoting diversity
Abstract: This paper examines how verbal periphrastic constructions can be automatically identified in Universal Dependencies (UD) treebanks and how CoNLL-U files can be enriched with phrase-level tags that make such constructions easier to retrieve. Focusing on periphrases expressing modality, conation, and phase, the study combines structural cues from UD annotation with lexical information, in particular the verb lemmas that recur in these constructions. A script was developed to search parsed corpora for these patterns and assign functional tags to the matches. The procedure was tested on the Porttinari treebank for Brazilian Portuguese and the AnCora treebank for Spanish. The results support combining structural and lexical criteria for the automatic extraction of verbal periphrases and suggest that such enrichment is useful both for treebank annotation and for text analysis across languages and genres.
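To make the combination of structural and lexical criteria concrete, the following is a minimal sketch of how such a search could look, not the authors' script. It assumes that a periphrasis is signalled either by an auxiliary attached with `aux` to a main verb or by a verb that governs another verb through `xcomp`. The lemma lexicon, the function names, and the `Periphrasis` key written to the MISC column are all hypothetical and chosen for illustration only.

```python
# Hypothetical lemma lexicon: illustrative entries, not the authors' list.
PERIPHRASIS_LEMMAS = {
    "poder": "modality",
    "dever": "modality",
    "tentar": "conation",
    "começar": "phase",
    "continuar": "phase",
}


def parse_sentence(block):
    """Read one CoNLL-U sentence into a list of 10-column token rows."""
    rows = []
    for line in block.strip().splitlines():
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        rows.append(cols)
    return rows


def tag_periphrases(rows, misc_key="Periphrasis"):
    """Tag candidate periphrases in place; return (auxiliary, main verb, category)."""
    by_id = {r[0]: r for r in rows}
    found = []
    for r in rows:
        category = PERIPHRASIS_LEMMAS.get(r[2])  # lexical cue: LEMMA column
        if category is None:
            continue
        main = None
        head = by_id.get(r[6])
        # Structural cue 1 (assumed): auxiliary attached to a main verb via `aux`.
        if r[7] == "aux" and head is not None and head[3] == "VERB":
            main = head
        else:
            # Structural cue 2 (assumed): verb governing another verb via `xcomp`.
            for c in rows:
                if c[6] == r[0] and c[7] == "xcomp" and c[3] == "VERB":
                    main = c
                    break
        if main is None:
            continue
        tag = f"{misc_key}={category}"
        r[9] = tag if r[9] == "_" else f"{r[9]}|{tag}"  # append to MISC column
        found.append((r[1], main[1], category))
    return found


# Toy sentence built by hand for illustration: "Ela começou a estudar."
sample = "\n".join("\t".join(t) for t in [
    ("1", "Ela", "ela", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "começou", "começar", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "a", "a", "ADP", "_", "_", "4", "mark", "_", "_"),
    ("4", "estudar", "estudar", "VERB", "_", "VerbForm=Inf", "2", "xcomp", "_", "_"),
    ("5", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
])

rows = parse_sentence(sample)
found = tag_periphrases(rows)
print(found)
print("\n".join("\t".join(r) for r in rows))
```

In this sketch the tag is stored in the MISC column so that the other nine columns stay untouched and the file remains valid CoNLL-U; where the actual script records its tags is not stated in the abstract.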
Tracks For Type Of Contribution: Work in progress
Do You Need Visa To Attend The 4th UniDive General Meeting In Romania: No
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 56