Abstract: Contemporary machine translation systems excel at preserving semantic content but inadequately address the discourse-level argumentative structures critical to specialized communication. We introduce the Argumentation Preservation Assessment Framework (APAF), an evaluation approach that quantifies how effectively translations maintain the logical architecture of arguments across languages. APAF identifies and categorizes argumentative elements (claims, premises, examples) in source and target texts, employs neural embeddings for cross-lingual comparison, and computes comprehensive preservation metrics. Through evaluation on Chinese-English legal translations, we demonstrate that argumentation preservation is a distinct quality dimension not captured by conventional metrics. Results show that while commercial systems and large language models perform reasonably well (CAPS scores of 0.66-0.73), they reach only 72-80% of human-level performance (0.91), with relationship preservation consistently lagging behind component preservation. Our framework enables systematic assessment of a critical but previously unmeasured dimension of translation quality, one that is particularly valuable in domains where argumentative integrity directly affects functional efficacy.
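The abstract does not specify implementation details of APAF or CAPS, but the pipeline it describes (identify argumentative units, embed them cross-lingually, score preservation) can be sketched. The following is a minimal illustration, assuming precomputed multilingual sentence embeddings (e.g. from a model such as LaBSE) and using hypothetical function names and a hypothetical weighting; it is not the authors' actual metric.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def component_preservation(src_embs, tgt_embs):
    """Score how well source argumentative units (claims, premises,
    examples) are preserved in the target: each source unit is scored
    by its best-matching target unit under cosine similarity.
    Inputs are lists/arrays of multilingual sentence embeddings."""
    if len(src_embs) == 0 or len(tgt_embs) == 0:
        return 0.0
    sims = np.array([[cosine(s, t) for t in tgt_embs] for s in src_embs])
    return float(sims.max(axis=1).mean())

def caps(component_score, relation_score, alpha=0.5):
    """Hypothetical composite preservation score combining component-
    and relationship-level preservation; the weighted-mean form and
    alpha=0.5 are assumptions, not the paper's definition."""
    return alpha * component_score + (1.0 - alpha) * relation_score

# Toy usage with random stand-in embeddings (real use would embed
# extracted claims/premises from source and translated texts).
rng = np.random.default_rng(0)
src = rng.normal(size=(4, 768))   # 4 source argumentative units
tgt = rng.normal(size=(3, 768))   # 3 target argumentative units
print(caps(component_preservation(src, tgt), relation_score=0.6))
```

A best-match score like this captures component preservation only; scoring relationship preservation would additionally require extracting support/attack links between units in both texts and comparing the resulting argument graphs, which the abstract identifies as the weaker dimension for MT systems.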
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: Machine Translation, Automatic Evaluation, Argument Mining
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Chinese, English
Submission Number: 5710