Improving Translation Faithfulness of Large Language Models via Augmenting Instructions

ACL ARR 2024 June Submission 3446 Authors

16 Jun 2024 (modified: 13 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) exhibit strong general capabilities, and a compelling current challenge is eliciting their specialized capabilities, such as machine translation, through low-cost instruction tuning. Standard instruction-following data is organized as a concatenation of instruction, input, and output. Due to the inherent pattern of the attention mechanism in LLMs, these models tend to concentrate on nearby tokens, so there is a high risk of the instruction being forgotten during decoding, particularly with long contexts. To alleviate instruction forgetting in translation, we propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset, OVERUNDER. SWIE improves instruction understanding by adding an instruction representation to the subsequent input and response representations. OVERUNDER improves model faithfulness by contrasting over-translation and under-translation samples with the correct translation. We apply our methods to two mainstream open-source LLMs, BLOOM and LLaMA. Experimental results demonstrate that models using SWIE and OVERUNDER improve translation performance and faithfulness over strong baselines. Furthermore, SWIE improves performance in various long-context scenarios, including in-context translation, translation in language directions covered by the instruction-tuning corpus, and translation on zero-shot language pairs. The effectiveness of SWIE is also demonstrated on the IFEval instruction-following test set, indicating its potential for broader task applicability.
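To make the SWIE idea concrete, the following is a minimal PyTorch sketch, assuming the instruction span is mean-pooled into a single vector and added, scaled by fixed per-segment weights, onto the input and response token embeddings. The module name, the pooling, the projection, and the weighting scheme are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class SWIELayer(nn.Module):
    """Illustrative Segment-Weighted Instruction Embedding (SWIE) sketch.

    Pools the instruction tokens into one vector and adds it, scaled by
    a per-segment weight, to the input- and response-segment embeddings.
    The pooling and fixed weights here are assumptions for illustration.
    """
    def __init__(self, hidden_size: int, input_weight: float = 1.0,
                 response_weight: float = 1.0):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)  # hypothetical projection
        self.input_weight = input_weight
        self.response_weight = response_weight

    def forward(self, embeds: torch.Tensor, segment_ids: torch.Tensor) -> torch.Tensor:
        # embeds: (batch, seq_len, hidden); segment_ids: (batch, seq_len)
        # with 0 = instruction, 1 = input, 2 = response.
        instr_mask = (segment_ids == 0).unsqueeze(-1).to(embeds.dtype)
        # Mean-pool the instruction span into one vector per example.
        instr_repr = (embeds * instr_mask).sum(1) / instr_mask.sum(1).clamp(min=1.0)
        instr_repr = self.proj(instr_repr).unsqueeze(1)  # (batch, 1, hidden)
        # Per-token weights: zero on the instruction itself, fixed elsewhere.
        weights = torch.zeros_like(segment_ids, dtype=embeds.dtype)
        weights = torch.where(segment_ids == 1,
                              torch.full_like(weights, self.input_weight), weights)
        weights = torch.where(segment_ids == 2,
                              torch.full_like(weights, self.response_weight), weights)
        # Broadcast the instruction representation onto input/response tokens.
        return embeds + weights.unsqueeze(-1) * instr_repr
```

Injecting the instruction representation directly into later segments keeps the instruction in view even when attention concentrates on nearby tokens, which is the forgetting failure the abstract describes.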
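Similarly, an OVERUNDER-style training record could pair a faithful reference with contrastive negatives that add content absent from the source (over-translation) or drop source content (under-translation). All field names and sentences below are invented for illustration; the dataset's actual schema is not specified here.

```python
# Hypothetical OVERUNDER-style contrastive sample (all fields are assumptions).
sample = {
    "instruction": "Translate the following sentence from English to German.",
    "input": "The cat sat on the mat.",
    "output": "Die Katze saß auf der Matte.",                            # faithful reference
    "over_translation": "Die kleine Katze saß gemütlich auf der Matte.",  # adds content
    "under_translation": "Die Katze saß.",                                # drops content
}
```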
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: modeling, fine-tuning
Contribution Types: NLP engineering experiment
Languages Studied: English; Chinese; German; Czech; Japanese; Korean
Submission Number: 3446