Don't Take it Literally! Idiom-aware Vietnamese Translation via In-context Learning

ACL ARR 2025 July Submission1148 Authors

29 Jul 2025 (modified: 21 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: The translation of idiomatic expressions often results in misunderstandings and inaccuracies, affecting both everyday communication and machine translation. This paper introduces Idiom-aware Vietnamese Translation (IDiAT), a new framework for the evaluation of idiomatic translation for Vietnamese, along with state-of-the-art results for this task. We collect and curate a high-quality Vietnamese-English idiom set that serves as a resource for in-context learning (ICL) in Vietnamese translation. IDiAT's evaluation benchmark includes both idiomatic and non-idiomatic text pairs to assess general translation quality and idiomatic translation performance. We leverage ICL in large language models, using IDiAT to enhance few-shot demonstrations with idiom and topic descriptions, improving translation accuracy. Empirical results demonstrate that our IDiAT-based ICL outperforms traditional methods while requiring fewer data samples, and human evaluations confirm its effectiveness. Though focusing on the Vietnamese language, our proposed idiom-based ICL approach advances idiomatic translation and contributes to the development of culturally aware translation systems, paving the way for future research in low-resource languages. The experimental materials will be publicly available for research purposes.
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: Efficient/Low-Resource Methods for NLP, Resources and Evaluation, Machine Translation
Contribution Types: Approaches to low-resource settings, Approaches low compute settings-efficiency, Data resources
Languages Studied: English, Vietnamese, Japanese, Korean, Thai, Finnish, Slovenian
Previous URL: https://openreview.net/forum?id=JW20SOIEJK
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: In the previous review cycle, a reviewer appeared to overlook key contributions of our work. This misrepresentation suggests a lack of careful reading or domain familiarity. For this reason, we respectfully request a different area chair and a new set of reviewers to ensure a fair and informed evaluation.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 2, 4, 6
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 2
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4, and Appendix
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 4, and Appendix
C3 Descriptive Statistics: N/A
C4 Parameters For Packages: Yes
C4 Elaboration: Section 4
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: Section 4, and Appendix
D2 Recruitment And Payment: Yes
D2 Elaboration: Section 4, and 5
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: N/A
E1 Elaboration: We use ChatGPT and Grammarly to correct grammar and improve our writing.
Author Submission Checklist: yes
Submission Number: 1148