Abstract: Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful human evaluations reveal that the translations produced by LLMs still contain multiple errors. Importantly, feeding such error information back into the LLMs can lead to self-correction and improved translation performance. Motivated by these insights, we introduce a systematic LLM-based self-correcting translation framework, named TER, which stands for Translate, Estimate, and Refine, marking a significant step forward in this direction. Our findings demonstrate that 1) our self-correction framework successfully helps LLMs improve their translation quality across a wide range of languages, whether translating from high-resource to low-resource languages, and whether the translation is English-centric or centered on other languages; 2) TER exhibits superior systematicity and interpretability compared to previous methods; 3) different estimation strategies yield varied AI feedback, which directly affects the effectiveness of the final corrections. We further compare different LLMs and conduct various experiments involving self-correction and cross-model correction to investigate the potential relationship between the translation and evaluation capabilities of LLMs. The code will be made available upon publication.
Paper Type: long
Research Area: Machine Translation
Languages Studied: English
Preprint Status: We are considering releasing a non-anonymous preprint in the next two months (i.e., during the reviewing process).
A1: yes
A1 Elaboration For Yes Or No: We attach limitations after the main text.
A2: yes
A2 Elaboration For Yes Or No: In our limitations. Furthermore, getting reliable and consistent outputs from generative language models is a known problem: https://openreview.net/forum?id=98p5x51L5af.
A3: yes
A3 Elaboration For Yes Or No: Our abstract summarizes the main results and takeaways.
B: yes
B1: yes
B1 Elaboration For Yes Or No: Section 3
B2: n/a
B3: n/a
B3 Elaboration For Yes Or No: While large language models are not directly intended for machine translation, self-correction, or many other downstream NLP tasks, they have been shown to be able to self-correct their own translations. Our work studies this phenomenon and provides a framework and analysis to improve this capability.
B4: n/a
B5: yes
B5 Elaboration For Yes Or No: Section 3
B6: yes
B6 Elaboration For Yes Or No: Section 3
C: yes
C1: yes
C1 Elaboration For Yes Or No: Section 3
C2: yes
C2 Elaboration For Yes Or No: Section 3
C3: yes
C3 Elaboration For Yes Or No: Section 4, 5
C4: yes
C4 Elaboration For Yes Or No: Section 3
D: yes
D1: yes
D1 Elaboration For Yes Or No: Section 3
D2: yes
D2 Elaboration For Yes Or No: Section 3. We recruited graduate student volunteers.
D3: no
D3 Elaboration For Yes Or No: We notified volunteers of the intended usage of the data before recruiting them.
D4: n/a
D5: n/a
E: no
E1: n/a