Abstract: Pseudo-labeling based semi-supervised learning can mitigate the performance degradation resulting from the absence of labeled data in the target domain. In pseudo-labeling, the quality of the pseudo-labels is crucial for the final performance. However, most works overlook the potential benefits of using the decoder to generate pseudo-labels within the mainstream hybrid Connectionist Temporal Classification (CTC) and attention (CTC/attention) based automatic speech recognition (ASR) architecture. Therefore, we propose Hybrid Pseudo-Labeling (HPL) to improve the quality of pseudo-labels during online decoding. HPL introduces a second-stage decoding pass that uses the decoder for error correction, alleviating the substitution errors that arise from the conditional independence assumption inherent in CTC. Furthermore, we propose Hybrid Selection to optimally combine the results of the encoder and the decoder. Additionally, we introduce Speed Perturbation Enhancement (SPE) to further enhance the quality of pseudo-labels via speed perturbation. Experiments demonstrate that HPL achieves state-of-the-art performance compared to other mainstream pseudo-labeling methods.
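To illustrate the conditional independence assumption of CTC mentioned in the abstract, the following is a minimal sketch of standard CTC greedy decoding, the kind of first-stage encoder output that a second decoding pass would then revise. It is not the authors' implementation: the function name `ctc_greedy_decode`, the blank index `0`, and the toy score matrix are assumptions made for illustration only.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, blank=0):
    """Greedy CTC decoding over per-frame vocabulary scores.

    frame_scores: list of lists, one score vector per frame (hypothetical input).
    blank: index of the CTC blank symbol (assumed to be 0 here).
    """
    # Each frame is decoded on its own: the argmax at frame t ignores the
    # choices made at every other frame (conditional independence).
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Standard CTC collapse: merge consecutive repeats, then drop blanks.
    collapsed = [tok for tok, _ in groupby(best_path)]
    return [tok for tok in collapsed if tok != blank]


# Toy example with a 3-symbol vocabulary (0 = blank).
toy_scores = [
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
]
print(ctc_greedy_decode(toy_scores))
```

Because every frame is labeled without reference to its neighbors, a single locally plausible but wrong argmax produces a substitution error that nothing in this pass can repair, which is the weakness the abstract attributes to CTC-only pseudo-labels.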
External IDs: dblp:conf/icassp/Zheng0YWCL25