RAG Picking Helps: Retrieval Augmented Generation for Machine Translation

ACL ARR 2024 December Submission743 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Machine Translation (MT) has improved considerably over the years, especially with the introduction of neural approaches. However, such approaches struggle in scenarios like domain adaptation and low-resource settings because they rely solely on their parametric knowledge. We explore using a retrieval mechanism for MT and provide a detailed analysis of the quantitative and qualitative improvements it yields. We introduce RAGMT, a retrieval augmented generation (RAG)-based multi-task fine-tuning approach for MT that draws on non-parametric knowledge sources. We also propose new auxiliary training objectives that improve the performance of RAG for domain-specific MT. To the best of our knowledge, we are the first to adapt the RAG framework with a multi-task training objective for MT to support end-to-end training. Our experiments demonstrate that retrieval-augmented fine-tuning of MT models under the RAGMT framework yields an average improvement of 12.90 BLEU points over simple fine-tuning approaches on English-German domain-specific translation. We also demonstrate RAGMT's ability to exploit in-domain knowledge bases versus domain-agnostic ones and perform careful ablations over the model components. Qualitatively, RAGMT is easily interpretable, stylistically aligns translation outputs to the domain of interest, and appears to demonstrate ``copy-over-translation'' behaviour with respect to named entities.
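The abstract's core idea, augmenting an MT model's input with entries retrieved from a non-parametric in-domain knowledge base, can be sketched in a few lines. The following is an illustrative simplification under assumed details, not the paper's actual RAGMT implementation: it retrieves the most similar source sentence from a hypothetical in-domain translation memory by token overlap and prepends the retrieved pair to the translation prompt.

```python
# Illustrative sketch of retrieval-augmented MT input construction.
# NOT the paper's RAGMT method: the retriever (token overlap) and the
# prompt format ("src => tgt") are simplifying assumptions.

def retrieve(query, memory, k=1):
    """Rank (source, target) pairs by token overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(
        memory,
        key=lambda pair: -len(q & set(pair[0].lower().split())),
    )
    return scored[:k]

def build_input(query, memory, k=1):
    """Prepend retrieved in-domain pairs to the query as model input."""
    context = "".join(f"{src} => {tgt}\n" for src, tgt in retrieve(query, memory, k))
    return context + f"{query} =>"

# Hypothetical in-domain (medical) English-German translation memory.
memory = [
    ("Take one tablet daily.", "Nehmen Sie täglich eine Tablette."),
    ("Store in a cool place.", "An einem kühlen Ort lagern."),
]
print(build_input("Take two tablets daily.", memory))
```

In the full framework, retriever and generator would be trained jointly end-to-end with the multi-task objectives described in the abstract; this sketch only shows the non-parametric augmentation step that makes the output interpretable (the retrieved evidence is visible in the input).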
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: domain adaptation, retrieval-augmented generation
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English, German
Submission Number: 743