Abstract: This paper addresses the challenge of distribution shift in automatic speech recognition (ASR) systems through a novel framework named adaptation without mode collapse (AWMC). Distribution shift, where the source and target distributions differ, can severely degrade the performance of ASR systems. The proposed AWMC framework is designed to adapt to sequentially streamed utterances while mitigating mode collapse, a common problem with traditional test-time adaptation (TTA) methods. The framework employs three parameter-shared models (anchor, chaser, and leader) in concert to adapt continually to target data, improving the performance and reliability of the system. The effectiveness of AWMC is demonstrated through comprehensive performance comparisons with state-of-the-art TTA methods on widely recognized ASR datasets.