Building High-Accuracy Multilingual ASR With Gated Language Experts and Curriculum Training

Published: 01 Jan 2023 · Last Modified: 09 Apr 2025 · ASRU 2023 · CC BY-SA 4.0
Abstract: We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring user input for language identification (LID) during inference. Our method incorporates a gating mechanism and an LID loss so that the transformer experts learn language-specific information. Linear experts are applied to the joint network to stabilize training. The curriculum training scheme leverages LID to guide the gated experts toward improving their respective language-specific performance. Experimental results on an English-Spanish bilingual task show significant average relative word error rate reductions of 12.5% and 7.3% compared to the baseline bilingual and monolingual models, respectively. Our models even perform comparably to upper-bound models with oracle LID. Extending our approach to trilingual, quadrilingual, and pentalingual models yields gains similar to those seen in the bilingual models, highlighting its ease of extension to multiple languages.
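To make the gating idea concrete, below is a minimal PyTorch sketch of a gated language-expert layer, written from the abstract alone: per-language expert feed-forward blocks, a gate that predicts frame-level language posteriors and mixes the expert outputs, and an auxiliary cross-entropy LID loss on the gate logits. All names (`GatedLanguageExperts`, `d_model`, `lid_targets`) and the exact expert/gate shapes are assumptions for illustration, not the paper's implementation; the paper's placement of linear experts on the joint network and its curriculum schedule are not reproduced here.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedLanguageExperts(nn.Module):
    """Hypothetical gated language-expert layer (sketch, not the paper's code).

    One feed-forward expert per language; a gate network predicts
    language posteriors per frame and mixes the expert outputs.
    An auxiliary LID loss on the gate logits encourages the gate to
    route frames to the correct language expert.
    """

    def __init__(self, d_model: int, num_languages: int, d_ff: int = 1024):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.ReLU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(num_languages)
        )
        # Gate maps each frame to a distribution over languages.
        self.gate = nn.Linear(d_model, num_languages)

    def forward(
        self,
        x: torch.Tensor,                       # (batch, time, d_model)
        lid_targets: Optional[torch.Tensor] = None,  # (batch, time) language IDs
    ):
        gate_logits = self.gate(x)                   # (B, T, L)
        gate_probs = F.softmax(gate_logits, dim=-1)  # frame-level language posteriors
        # Run every expert and stack: (B, T, D, L).
        expert_outs = torch.stack([e(x) for e in self.experts], dim=-1)
        # Mix expert outputs, weighted by the gate's posteriors.
        mixed = (expert_outs * gate_probs.unsqueeze(2)).sum(dim=-1)  # (B, T, D)
        lid_loss = None
        if lid_targets is not None:
            # Auxiliary LID loss pushes the gate toward the true language.
            lid_loss = F.cross_entropy(
                gate_logits.reshape(-1, gate_logits.size(-1)),
                lid_targets.reshape(-1),
            )
        return mixed, lid_loss


# Usage sketch: 2 languages (e.g. English/Spanish), frame-level LID labels.
layer = GatedLanguageExperts(d_model=256, num_languages=2)
frames = torch.randn(4, 100, 256)
labels = torch.randint(0, 2, (4, 100))
out, lid_loss = layer(frames, labels)
```

Under this reading, a curriculum scheme could first weight the LID loss heavily so the gate learns confident routing, then relax it so the experts specialize; since LID comes from the gate itself, no user-supplied language tag is needed at inference.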