Evaluating Coreference Consistency in Chinese-to-English Dialogue Translation

ACL ARR 2026 January Submission 9054 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Machine Translation, Large Language Model, Coreference Consistency

Abstract: Coreference consistency is essential for preserving discourse coherence in dialogue translation, yet it remains underexplored in the context of large language models (LLMs). In this paper, we present a comprehensive study of coreference preservation in Chinese-to-English dialogue translation. We construct a new dataset based on the RiSAWOZ corpus, annotated with source-side zero pronouns and coreference chains. Translations are obtained from a diverse set of LLMs of varying sizes. To evaluate whether coreference relations are maintained, we propose the **Coreference Consistency Score** (CCS), which quantifies the extent to which translated mentions preserve original coreferential chains. We further introduce an LLM-as-Judge protocol for assessing the faithfulness of translated mentions. Our results reveal that, despite producing fluent translations, many LLMs, including both small and large models, struggle to maintain cross-lingual coreference, highlighting the need for more discourse-aware dialogue translation models. All code and data will be released.

Paper Type: Long

Research Area: Machine Translation

Research Area Keywords: Machine Translation

Languages Studied: Chinese, English

Submission Number: 9054
