DMWE: Differential Testing for Multi-word Expressions in Machine Translation

ACL ARR 2025 February Submission5763 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Advances in deep neural networks have greatly improved machine translation, and people now widely use machine translation tools for tasks such as reviewing foreign-language documents. However, because neural networks are complex, translation errors still occur and can lead to misunderstandings or conflicts. Existing machine translation systems often focus on sentence coherence while neglecting the accuracy of phrase translation, and most testing methods operate at the sentence level. This paper investigates multi-word expressions, a specific form of phrase that is prone to errors, and proposes DMWE, a differential testing method for multi-word expressions in machine translation. We evaluate multi-word expressions by comparing the similarity of their translations across different translation software, based on the idea that translations of a phrase within the same sentence should be similar across systems. Using three common types of multi-word expressions (Noun + Noun, Adjective + Noun, and Verb + Noun), we tested 1498, 1372, and 1525 sentences with Google Translate, Microsoft Bing Translator, and Baidu Translate. The results show that DMWE detects translation errors with high precision.
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: automatic evaluation
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: Chinese, English
Submission Number: 5763
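
Illustrative sketch: the differential idea described in the abstract (compare how several systems translate the same multi-word expression and flag the one that disagrees) might look roughly like the Python below. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the similarity metric, so `difflib.SequenceMatcher` is a stand-in; the `flag_suspicious` helper, the 0.4 threshold, the system names, and the example strings are all hypothetical, and no translation API is called.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]. A stand-in metric, not the paper's."""
    return SequenceMatcher(None, a, b).ratio()


def flag_suspicious(translations: dict[str, str], threshold: float = 0.4) -> list[str]:
    """Return the systems whose phrase translation disagrees with the others.

    `translations` maps a (hypothetical) system name to that system's
    translation of the same multi-word expression. A system is flagged when
    its mean similarity to all other systems falls below `threshold`
    (an assumed value chosen only for illustration).
    """
    flagged = []
    for name, text in translations.items():
        scores = [
            similarity(text, other_text)
            for other_name, other_text in translations.items()
            if other_name != name
        ]
        # With a single system there is nothing to compare against.
        if scores and sum(scores) / len(scores) < threshold:
            flagged.append(name)
    return flagged


# Invented example: two systems agree on the phrase, one diverges.
candidates = {
    "system_a": "机器学习",
    "system_b": "机器学习",
    "system_c": "汽车修理",
}
suspects = flag_suspicious(candidates)
```

In this sketch a flagged system is only a candidate for manual inspection; disagreement between systems does not by itself show which translation is wrong.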