Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral

ACL ARR 2025 February Submission 1361 Authors

13 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Larger models often outperform smaller ones but come with high computational costs. Cascading offers a potential solution. By default, it uses smaller models and defers only some instances to larger, more powerful models. However, designing effective deferral rules remains a challenge. In this paper, we propose a simple yet effective approach for machine translation, using existing quality estimation (QE) metrics as deferral rules. We show that QE-based deferral allows a cascaded system to match the performance of a larger model while invoking it for a small fraction ($30\%$ to $50\%$) of the examples, significantly reducing computational costs. We validate this approach through both automatic and human evaluation.
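The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact deferral rule, so this version assumes a fixed deferral budget (the lowest-scoring fraction of examples goes to the larger model). The names `cascade_translate`, `small_translate`, `large_translate`, `qe_score`, and `defer_fraction` are hypothetical placeholders for whatever translation models and QE metric are actually used.

```python
from typing import Callable, List, Tuple


def cascade_translate(
    sources: List[str],
    small_translate: Callable[[str], str],   # hypothetical: cheap model
    large_translate: Callable[[str], str],   # hypothetical: expensive model
    qe_score: Callable[[str, str], float],   # hypothetical: higher = better
    defer_fraction: float = 0.3,             # assumed budget, not from the paper
) -> Tuple[List[str], List[int]]:
    """Translate everything with the small model, then re-translate the
    lowest-scoring fraction of examples with the large model."""
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must be in [0, 1]")

    # Step 1: the small model handles every input by default.
    drafts = [small_translate(src) for src in sources]

    # Step 2: a reference-free QE metric scores each (source, draft) pair.
    scores = [qe_score(src, hyp) for src, hyp in zip(sources, drafts)]

    # Step 3: pick the examples with the lowest QE scores for deferral.
    n_defer = int(defer_fraction * len(sources))
    worst_first = sorted(range(len(sources)), key=lambda i: scores[i])
    deferred = sorted(worst_first[:n_defer])

    # Step 4: only the deferred examples invoke the large model.
    outputs = list(drafts)
    for i in deferred:
        outputs[i] = large_translate(sources[i])
    return outputs, deferred


if __name__ == "__main__":
    # Toy stand-ins: the "QE metric" simply distrusts longer inputs.
    outs, deferred = cascade_translate(
        ["a", "bbbb", "cc", "ddddd"],
        small_translate=str.lower,
        large_translate=str.upper,
        qe_score=lambda src, hyp: -float(len(src)),
        defer_fraction=0.5,
    )
    print(outs, deferred)
```

A fixed budget makes the cost of the cascade predictable. A score threshold would be an equally plausible rule, but it would let the share of deferred examples vary with the input.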
Paper Type: Short
Research Area: Machine Translation
Research Area Keywords: Machine translation, efficiency, quality estimation, cascaded systems
Languages Studied: English, Czech, German, Spanish, Hindi, Icelandic, Japanese, Russian, Ukrainian
Submission Number: 1361