Building High-Accuracy Multilingual ASR With Gated Language Experts and Curriculum Training

Published: 01 Jan 2023 · Last Modified: 09 Apr 2025 · ASRU 2023 · CC BY-SA 4.0
Abstract: We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring user input for language identification (LID) during inference. Our method incorporates a gating mechanism and an LID loss so that the transformer experts learn language-specific information. Linear experts are applied to the joint network to stabilize training. The curriculum training scheme leverages LID to guide the gated experts toward improving their respective language-specific performance. Experimental results on an English-Spanish bilingual task show significant average relative word error rate reductions of 12.5% and 7.3% compared to the baseline bilingual and monolingual models, respectively. Our models even perform comparably to upper-bound models with oracle LID. Extending our approach to trilingual, quadrilingual, and pentalingual models yields gains similar to those seen in the bilingual models, highlighting its ease of extension to multiple languages.
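To make the gating idea concrete, below is a minimal PyTorch sketch of a gated language-expert layer, written from the abstract alone: per-language expert feed-forward blocks, a gate that predicts frame-level language posteriors and mixes the expert outputs, and an auxiliary cross-entropy LID loss on the gate logits. All names (`GatedLanguageExperts`, `d_model`, `lid_targets`) and the exact expert/gate shapes are assumptions for illustration, not the paper's implementation; the paper's placement of linear experts on the joint network and its curriculum schedule are not reproduced here.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedLanguageExperts(nn.Module):
    """Hypothetical gated language-expert layer (sketch, not the paper's code).

    One feed-forward expert per language; a gate network predicts
    language posteriors per frame and mixes the expert outputs.
    An auxiliary LID loss on the gate logits encourages the gate to
    route frames to the correct language expert.
    """

    def __init__(self, d_model: int, num_languages: int, d_ff: int = 1024):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.ReLU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(num_languages)
        )
        # Gate maps each frame to a distribution over languages.
        self.gate = nn.Linear(d_model, num_languages)

    def forward(
        self,
        x: torch.Tensor,                       # (batch, time, d_model)
        lid_targets: Optional[torch.Tensor] = None,  # (batch, time) language IDs
    ):
        gate_logits = self.gate(x)                   # (B, T, L)
        gate_probs = F.softmax(gate_logits, dim=-1)  # frame-level language posteriors
        # Run every expert and stack: (B, T, D, L).
        expert_outs = torch.stack([e(x) for e in self.experts], dim=-1)
        # Mix expert outputs, weighted by the gate's posteriors.
        mixed = (expert_outs * gate_probs.unsqueeze(2)).sum(dim=-1)  # (B, T, D)
        lid_loss = None
        if lid_targets is not None:
            # Auxiliary LID loss pushes the gate toward the true language.
            lid_loss = F.cross_entropy(
                gate_logits.reshape(-1, gate_logits.size(-1)),
                lid_targets.reshape(-1),
            )
        return mixed, lid_loss


# Usage sketch: 2 languages (e.g. English/Spanish), frame-level LID labels.
layer = GatedLanguageExperts(d_model=256, num_languages=2)
frames = torch.randn(4, 100, 256)
labels = torch.randint(0, 2, (4, 100))
out, lid_loss = layer(frames, labels)
```

Under this reading, a curriculum scheme could first weight the LID loss heavily so the gate learns confident routing, then relax it so the experts specialize; since LID comes from the gate itself, no user-supplied language tag is needed at inference.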