Est-ce que l'extraction des interrogatives du français peut-elle être automatisée ?

Published: 31 Oct 2023, Last Modified: 28 May 2025, Venue: 5èmes journées du Groupement de Recherche CNRS “Linguistique Informatique, Formelle et de Terrain” (LIFT 2023), Readers: Everyone, License: CC BY 4.0
Abstract: The vast majority of corpus studies of French interrogatives retrieve the target patterns by hand or with simple heuristics applied to raw text (e.g. interrogative words, question marks). In this paper, I present FUDIA (French UD Interrogative Annotator), a program that detects French interrogatives in a corpus annotated in Universal Dependencies (UD). FUDIA is a rule-based graph rewriting system built on Grew. I inventory the obstacles to this interrogative identification task and explain how FUDIA overcomes most of them. I show that, coupled with a parser fine-tuned on similar data, FUDIA obtains good results on raw text (written text and speech transcriptions).
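To make concrete what detection over UD annotations (rather than raw text) can look like, here is a minimal Python sketch. It is not FUDIA and does not use Grew: it only reads CoNLL-U and flags sentences containing a token whose morphological features include the UD value `PronType=Int`. The function names `parse_conllu` and `has_interrogative_word` and the two sample sentences are hypothetical, introduced for illustration only.

```python
# Illustrative sketch only: NOT FUDIA's rule set, and no Grew involved.
# Assumption: the input is CoNLL-U with the standard 10 tab-separated columns.

def parse_conllu(text):
    """Yield each sentence as a list of token dicts (hypothetical helper)."""
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        feats = {}
        if cols[5] != "_":
            feats = dict(pair.split("=", 1) for pair in cols[5].split("|"))
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "feats": feats,
            "head": int(cols[6]) if cols[6].isdigit() else None,
            "deprel": cols[7],
        })
    if sentence:
        yield sentence


def has_interrogative_word(sentence):
    """True if any token carries the UD feature value PronType=Int."""
    return any(
        "Int" in tok["feats"].get("PronType", "").split(",")
        for tok in sentence
    )


# Hypothetical sample: "Qui vient ?" followed by "Il vient."
SAMPLE = (
    "# text = Qui vient ?\n"
    "1\tQui\tqui\tPRON\t_\tPronType=Int\t2\tnsubj\t_\t_\n"
    "2\tvient\tvenir\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\t_\n"
    "3\t?\t?\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "# text = Il vient.\n"
    "1\tIl\til\tPRON\t_\tPronType=Prs\t2\tnsubj\t_\t_\n"
    "2\tvient\tvenir\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

sentences = list(parse_conllu(SAMPLE))
flags = [has_interrogative_word(s) for s in sentences]
```

A single-feature check like this is still a simple heuristic, now applied to annotations instead of raw text. It would miss interrogatives that contain no interrogative word, and it depends on the treebank or parser assigning `PronType=Int` consistently, which is assumed here.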