Abstract: Speaker recognition systems (SRS) play a vital role in identity authentication. However, researchers have found that these systems are highly vulnerable to backdoor attacks, in which a poisoned model misclassifies inputs that carry a trigger. Most backdoor attack methods focus on improving the attack success rate (ASR), reaching values as high as 99%, but they fall short on stealthiness: poisoned audio often differs from clean audio in ways that human listeners can hear or that visualization reveals. To overcome this issue, we prioritize stealthiness in our attack design and propose StealthPhase. Motivated by preliminary experiments on frequency-domain random noise backdoor attacks, our method implants a predefined trigger into the phase spectrum through frequency decomposition to ensure inherent stealth. The predefined trigger uses the natural phase pattern derived from real speech. It is therefore both learnable, which addresses the challenge of designing effective phase-based triggers, and stealthy, as it remains imperceptible in both spectrogram visualizations and auditory perception. A key advantage of our method is that it requires neither a complex trigger-optimization algorithm nor an extra loss function to balance stealthiness and effectiveness. Extensive experimental results demonstrate that StealthPhase achieves 99% ASR with minimal impact on the model’s benign accuracy (BA). Its stealthiness is validated from three perspectives. First, visualizations show that the backdoor audio samples are nearly indistinguishable from clean samples. Second, an audio quality assessment confirms that the trigger introduces minimal perceptual distortion, preserving the overall audio quality. Finally, a speech recognition performance evaluation shows that the word error rate (WER) remains largely unaffected.
Furthermore, we validate the effectiveness of StealthPhase in real-world scenarios, where it achieves an ASR of 80%, and demonstrate its ability to bypass defense mechanisms.
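To make the idea of a phase-spectrum trigger concrete, the following is a minimal sketch and not the StealthPhase implementation. It assumes a single short frame, copies the phase of a reference signal into a clean signal at a few chosen frequency bins, and leaves every magnitude unchanged. The function names (`dft`, `idft`, `inject_phase_trigger`), the choice of bins, and the synthetic sinusoidal signals are all hypothetical; the abstract does not specify how the frequency decomposition is performed or which bands carry the trigger. A naive DFT is used only to keep the example dependency-free; a real pipeline would more likely use an FFT-based short-time transform.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def inject_phase_trigger(clean, reference, bins):
    """Hypothetical helper: keep the magnitudes of `clean`, but take the
    phase of `reference` at the selected bins (0 < k < n/2)."""
    n = len(clean)
    if len(reference) != n:
        raise ValueError("clean and reference must have the same length")
    C, R = dft(clean), dft(reference)
    out = list(C)
    for k in bins:
        if not 0 < k < n / 2:
            raise ValueError("bins must exclude DC and Nyquist")
        # Magnitude from the clean frame, phase from the reference frame.
        out[k] = cmath.rect(abs(C[k]), cmath.phase(R[k]))
        # Mirror bin must be the conjugate so the time signal stays real.
        out[n - k] = out[k].conjugate()
    return [v.real for v in idft(out)]


# Synthetic stand-ins for a clean frame and a "real speech" reference frame.
N = 64
clean = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 5 * t / N)
         for t in range(N)]
reference = [math.cos(2 * math.pi * 3 * t / N + 0.7) + math.cos(2 * math.pi * 5 * t / N + 1.9)
             for t in range(N)]
poisoned = inject_phase_trigger(clean, reference, bins=[3, 5])
```

In this toy setting the magnitude spectrum of `poisoned` matches that of `clean`, so a magnitude spectrogram would look the same even though the waveform has changed. Matching magnitudes alone does not establish that a change is inaudible; the abstract supports that claim with separate audio quality and WER evaluations.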
External IDs: doi:10.1109/tifs.2025.3642543