The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid Transformers
Abstract: State-of-the-art neural machine translation (NMT) models deliver high-quality translations at the expense of high inference latency and energy consumption, requiring vast GPU fleets and contributing significantly to carbon emissions. To democratize and ``green'' NMT, we introduce the Green KNIGHT, a hardware-agnostic collection of recipes for optimizing inference speed and energy consumption, with only a minor trade-off in quality.
On two high-resource benchmarks we show speedups of up to 91$\times$ on CPU with 94\% energy savings for En$\to$De, and a 65$\times$ speedup with 90\% energy savings for En$\to$Ko, while incurring only a minor relative BLEU loss of 9\%.
Our results demonstrate that efficient and environmentally conscious NMT can be realized through optimizations built on well-understood, off-the-shelf techniques, with no custom low-level code required, making our approach immediately deployable in real-world translation pipelines.
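One of the off-the-shelf speed levers named in the title is greedy decoding: at each step the decoder emits the single highest-scoring token instead of maintaining a beam, trading a small amount of quality for a large reduction in compute. A minimal sketch of the idea, with a toy scoring function standing in for a real translation model (all names and shapes here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def greedy_decode(step_logits_fn, bos_id, eos_id, max_len=10):
    """Greedy decoding: pick the argmax token at every step (no beam).

    step_logits_fn is a hypothetical callable mapping the current token
    prefix to a vector of vocabulary scores; a real NMT model would run
    its decoder here.
    """
    tokens = [bos_id]
    for _ in range(max_len):
        logits = step_logits_fn(tokens)   # scores over the vocabulary
        next_id = int(np.argmax(logits))  # greedy choice: single best token
        tokens.append(next_id)
        if next_id == eos_id:             # stop once end-of-sequence is emitted
            break
    return tokens

# Toy "model": deterministically prefers token (len(prefix) % 4),
# so it reaches EOS (id 3) after three steps.
def toy_step(prefix):
    logits = np.zeros(4)
    logits[len(prefix) % 4] = 1.0
    return logits

print(greedy_decode(toy_step, bos_id=0, eos_id=3, max_len=8))
```

Because only one hypothesis is extended per step, the decoder's per-sentence cost drops roughly by the beam width relative to beam search, which is why it features prominently in CPU-oriented efficiency recipes.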
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: efficient inference for MT, MT deployment and maintenance, scaling, modeling
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English, German, Korean
Submission Number: 6665