Abstract: This paper addresses a common challenge in speaker verification: system performance degrades when speech is inconsistent and acoustic conditions are mismatched across domains. We propose a Noise-Aware Quality Network that estimates a score based on speech quality and the presence of speech obscured by noise in real-world environments. The score, obtained by normalizing estimated speech quality evaluations, is incorporated into a proposed Noise-Aware Quality loss function, which prioritizes speech quality by weighting embedding distances according to the quality score. Our method significantly improves speaker verification performance, particularly in noisy environments. Our work also highlights the importance of speech quality and the potential benefits of incorporating a speech quality weight into the loss function for speaker verification.
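The idea of weighting embedding distances by a normalized quality score can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: min-max normalization, squared Euclidean distance to a per-speaker centroid, and the hypothetical names `normalize_quality` and `quality_weighted_loss`. It should not be read as the proposed Noise-Aware Quality loss itself.

```python
from typing import List, Sequence


def normalize_quality(raw_scores: Sequence[float]) -> List[float]:
    """Min-max normalize raw quality estimates to [0, 1] (assumed scheme)."""
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All utterances have the same estimated quality: weight them equally.
        return [1.0 for _ in raw_scores]
    return [(s - lo) / (hi - lo) for s in raw_scores]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quality_weighted_loss(
    embeddings: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
    labels: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Weighted mean of each embedding's distance to its speaker centroid.

    Higher-quality utterances (larger weight) contribute more to the loss;
    this direction of weighting is an assumption, not taken from the paper.
    """
    total = 0.0
    weight_sum = 0.0
    for emb, label, w in zip(embeddings, labels, weights):
        total += w * squared_distance(emb, centroids[label])
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy data: three utterances from one speaker with made-up quality estimates.
raw_quality = [1.0, 3.0, 5.0]
weights = normalize_quality(raw_quality)
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
centroids = [[0.0, 0.0]]
labels = [0, 0, 0]
loss = quality_weighted_loss(embeddings, centroids, labels, weights)
```

In this toy setup the lowest-quality utterance receives weight 0 and drops out of the loss, which shows one consequence of min-max normalization; a practical system might clip or offset the weights so that no utterance is discarded entirely.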