Abstract: Contemporary machine translation systems excel at preserving semantic content but inadequately address the discourse-level argumentative structures critical to specialized communication. We introduce the Argumentation Preservation Assessment Framework (APAF), an evaluation approach that quantifies how effectively translations maintain the logical architecture of arguments across languages. APAF identifies and categorizes argumentative elements (claims, premises, examples) in source and target texts, employs neural embeddings for cross-lingual comparison, and computes comprehensive preservation metrics. Through evaluation on Chinese-English legal translations, we demonstrate that argumentation preservation is a distinct quality dimension not captured by conventional metrics. Results show that while commercial systems and large language models perform reasonably well (CAPS scores of 0.66-0.73), they reach only 72-80% of human-level performance (0.91), with relationship preservation consistently lagging behind component preservation. Our framework enables systematic assessment of a critical but previously unmeasured dimension of translation quality, one that is particularly valuable in domains where argumentative integrity directly affects functional efficacy.
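The abstract does not specify implementation details of APAF or CAPS, but the pipeline it describes (identify argumentative units, embed them cross-lingually, score preservation) can be sketched. The following is a minimal illustration, assuming precomputed multilingual sentence embeddings (e.g. from a model such as LaBSE) and using hypothetical function names and a hypothetical weighting; it is not the authors' actual metric.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def component_preservation(src_embs, tgt_embs):
    """Score how well source argumentative units (claims, premises,
    examples) are preserved in the target: each source unit is scored
    by its best-matching target unit under cosine similarity.
    Inputs are lists/arrays of multilingual sentence embeddings."""
    if len(src_embs) == 0 or len(tgt_embs) == 0:
        return 0.0
    sims = np.array([[cosine(s, t) for t in tgt_embs] for s in src_embs])
    return float(sims.max(axis=1).mean())

def caps(component_score, relation_score, alpha=0.5):
    """Hypothetical composite preservation score combining component-
    and relationship-level preservation; the weighted-mean form and
    alpha=0.5 are assumptions, not the paper's definition."""
    return alpha * component_score + (1.0 - alpha) * relation_score

# Toy usage with random stand-in embeddings (real use would embed
# extracted claims/premises from source and translated texts).
rng = np.random.default_rng(0)
src = rng.normal(size=(4, 768))   # 4 source argumentative units
tgt = rng.normal(size=(3, 768))   # 3 target argumentative units
print(caps(component_preservation(src, tgt), relation_score=0.6))
```

A best-match score like this captures component preservation only; scoring relationship preservation would additionally require extracting support/attack links between units in both texts and comparing the resulting argument graphs, which the abstract identifies as the weaker dimension for MT systems.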
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: Machine Translation, Automatic Evaluation, Argument Mining
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Chinese, English
Submission Number: 5710