Advancing Idiomatic Understanding: Evaluating GPT-3.5 and Google Translate for Persian-English Translations
Abstract: Figurative language, especially idiomatic expressions, poses significant translation challenges due to its cultural and contextual nuances. Large Language Models (LLMs) like GPT-3.5 have shown greater capability in translating figurative language than state-of-the-art neural machine translation (NMT) systems. However, the impact of different prompting methods, and of combining NMT systems with LLMs, on idiom translation remains unexplored. This paper introduces two parallel datasets for Persian→English and English→Persian translation to address these challenges.
The Persian idiom examples are sampled from our PersianIdioms resource, which is compiled from an online dictionary and contains 2200 idioms with their meanings and popularity scores.
Using these datasets, we evaluate GPT models, Google Translate, and their combination, focusing on idiom translation accuracy, fluency, and contextual relevance. Additionally, we assess how well existing automatic evaluation metrics, as well as GPT-3.5 and GPT-4, evaluate idiomatic translations. Our results indicate that while Google Translate shows superior fluency, GPT-3.5 excels in accurately translating idioms. We also show that models are better at translating English idioms than Persian ones, and that the relative performance of model configurations depends on the translation direction.
We will release all our resources and annotations upon publication.
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: multilingualism, machine translation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: Persian, English
Submission Number: 5630