Abstract: Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful human evaluations reveal that the translations produced by LLMs still contain multiple errors. Importantly, feeding such error information back into the LLMs can lead to self-correction and improved translation performance. Motivated by these insights, we introduce a systematic LLM-based self-correcting translation framework, named TER, which stands for Translate, Estimate, and Refine, marking a significant step forward in this direction. Our findings demonstrate that 1) our self-correction framework successfully helps LLMs improve their translation quality across a wide range of languages, whether translating from high-resource to low-resource languages, and whether the translation is English-centric or centered on other languages; 2) TER exhibits superior systematicity and interpretability compared to previous methods; 3) different estimation strategies yield varied AI feedback, which directly affects the effectiveness of the final corrections. We further compare different LLMs and conduct various experiments involving self-correction and cross-model correction to investigate the potential relationship between the translation and evaluation capabilities of LLMs. The code will be made available upon publication.
Paper Type: long
Research Area: Machine Translation
Languages Studied: English
Preprint Status: We are considering releasing a non-anonymous preprint in the next two months (i.e., during the reviewing process).
A1: yes
A1 Elaboration For Yes Or No: We attach limitations after the main text.
A2: yes
A2 Elaboration For Yes Or No: In our limitations. Furthermore, getting reliable and consistent outputs from generative language models is a known problem: https://openreview.net/forum?id=98p5x51L5af.
A3: yes
A3 Elaboration For Yes Or No: Our abstract summarizes the main results and takeaways.
B: yes
B1: yes
B1 Elaboration For Yes Or No: Section 3
B2: n/a
B3: n/a
B3 Elaboration For Yes Or No: While large language models are not directly intended for machine translation, self-correction, or many other downstream NLP tasks, they have been shown to be able to self-correct their own translations. Our work studies this phenomenon and provides a framework and analysis to improve this capability.
B4: n/a
B5: yes
B5 Elaboration For Yes Or No: Section 3
B6: yes
B6 Elaboration For Yes Or No: Section 3
C: yes
C1: yes
C1 Elaboration For Yes Or No: Section 3
C2: yes
C2 Elaboration For Yes Or No: Section 3
C3: yes
C3 Elaboration For Yes Or No: Section 4, 5
C4: yes
C4 Elaboration For Yes Or No: Section 3
D: yes
D1: yes
D1 Elaboration For Yes Or No: Section 3
D2: yes
D2 Elaboration For Yes Or No: Section 3. We recruited graduate student volunteers.
D3: no
D3 Elaboration For Yes Or No: We notified volunteers of the intended usage of the data before recruiting them.
D4: n/a
D5: n/a
E: no
E1: n/a