Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction

Published: 13 Mar 2025 · Last Modified: 13 Mar 2025 · NAACL 2025 · Everyone · CC BY 4.0
Abstract: Ensembling neural machine translation (NMT) models to produce higher-quality translations than the L individual models has been extensively studied. Recent methods typically employ a candidate selection block (CSB) and an encoder-decoder fusion block (FB), requiring inference across all candidate models and leading to significant computational overhead, generally Ω(L). This paper introduces SmartGen, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidates and identifying optimal groups to pass to the fusion block for each input sentence. Furthermore, in prior work the CSB and FB were trained independently, leading to suboptimal NMT performance. Our DQN-based SmartGen addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without modification, by introducing a Competitive Correction Block (CCB). Finally, we validate our approach with extensive experiments on English-Hindi translation tasks in both directions, as well as English-to-Chinese and English-to-German.
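To make the selection mechanism concrete, here is a minimal PyTorch sketch of how a DQN-style candidate selector with fusion-block feedback might look. All names (QNetwork, select_candidates, training_step), dimensions, and the single-step reward formulation are illustrative assumptions, not the paper's actual implementation; the full method would likely add experience replay and a target network.

```python
# Hypothetical sketch of DQN-based candidate selection with FB reward.
# Assumptions: L candidate models, a fixed budget of K selections per
# sentence, and a precomputed source-sentence embedding as the state.
import torch
import torch.nn as nn

L = 8            # number of candidate NMT models in the ensemble (assumed)
K = 3            # fixed number of candidates passed to the fusion block
STATE_DIM = 256  # dimension of the source-sentence embedding (assumed)

class QNetwork(nn.Module):
    """Scores each of the L candidate models for a given source sentence."""
    def __init__(self, state_dim: int, n_models: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_models),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # (batch, L) Q-values, one per candidate

q_net = QNetwork(STATE_DIM, L)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)

def select_candidates(state: torch.Tensor, epsilon: float = 0.1) -> torch.Tensor:
    """Epsilon-greedy selection of the top-K candidate models."""
    if torch.rand(1).item() < epsilon:
        return torch.randperm(L)[:K]               # explore: random subset
    with torch.no_grad():
        q_values = q_net(state.unsqueeze(0)).squeeze(0)
    return torch.topk(q_values, K).indices         # exploit: top-K by Q-value

def training_step(state: torch.Tensor, chosen: torch.Tensor, reward: float) -> float:
    """One simplified (bandit-style) DQN update: regress the Q-values of the
    chosen models toward the reward returned by the fusion block, e.g. a
    translation-quality score of the fused output."""
    q_values = q_net(state.unsqueeze(0)).squeeze(0)
    target = torch.full((K,), reward)
    loss = nn.functional.mse_loss(q_values[chosen], target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with a random "sentence embedding" and a stand-in FB reward:
state = torch.randn(STATE_DIM)
chosen = select_candidates(state)
reward = 0.75  # placeholder for the fusion block's quality feedback
training_step(state, chosen, reward)
```

The key design point this sketch illustrates is the abstract's joint-training claim: because the reward comes from the fusion block's output quality rather than from a separately trained selection objective, the selector learns which K-subsets actually help the FB, at a per-sentence selection cost that is independent of running inference on all L models.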