Benchmarking and Improving Long-Text Translation with Large Language Models

Anonymous

16 Feb 2024 | ACL ARR 2024 February Blind Submission | Readers: Everyone

Abstract: Recent studies have shown that large language models (LLMs) are promising at handling long texts. However, their performance in machine translation (MT) of long documents remains underexplored. This paper examines how LLMs handle this task and offers a comprehensive evaluation of their capabilities and limitations in long-text MT. First, we collect and construct an instruction-based benchmark dataset designed for finetuning and evaluating LLMs, comprising multilingual, multi-domain, document-level parallel data. Second, we conduct a comprehensive comparison between MT models and LLMs on document-level translation. Our analysis reveals that LLMs exhibit shortcomings in long-text domains and that their performance degrades as document length increases. By exploiting various extrapolation strategies, we enhance the capacity of LLMs to translate longer texts. We will release our data, code, and models, which we hope will promote research in this field.

Paper Type: long

Research Area: Special Theme (conference specific)

Contribution Types: NLP engineering experiment, Reproduction study, Data resources

Languages Studied: English, German, Russian, Chinese
