Evaluating the Impact of Verbal Multiword Expressions on Machine Translation
Keywords: evaluation, evaluation methodologies, multiword expressions, machine translation
TL;DR: We evaluate the impact of verbal multiword expressions on machine translation quality across seven language pairs using eight machine translation systems.
Abstract: Verbal multiword expressions (VMWEs) present significant challenges for natural language processing due to their complex and often non-compositional nature. While machine translation models have improved substantially with the advent of language models in recent years, accurately translating these complex linguistic structures remains an open problem. In this study, we analyze the impact of three VMWE categories (verbal idioms, verb-particle constructions, and light verb constructions) on machine translation quality from English to multiple languages. Using established multiword expression datasets and standard machine translation datasets, we evaluate how state-of-the-art translation systems handle these expressions. Our experimental results consistently show that VMWEs negatively affect translation quality, with deeper analysis indicating that this degradation is primarily attributable to the VMWE itself rather than general sentence-level difficulty.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 21