Abstract: Deep neural network (DNN)-based Automatic Speech Recognition (ASR) systems are known to be vulnerable to adversarial attacks, maliciously implemented by adding small but powerful distortions to the original audio input. However, most existing methods that generate audio adversarial examples against ASR models cannot mount successful, robust attacks in the presence of defense methods. This paper proposes a novel framework for robust audio patch attacks using Physical Sample Simulation (PSS) and Adversarial Patch Noise Generation (APNG). First, the proposed PSS simulates real-world audio using selected room impulse responses when training the adversarial patches. Second, the proposed APNG generates imperceptible audio adversarial patch examples, using a voice activity detector to hide the adversarial patch noise in the non-silent regions of the input audio. Furthermore, the designed Sound Pressure Level (SPL)-based adaptive noise minimization algorithm further reduces the perturbation during the attack. Experimental results show that the proposed method achieves the highest attack success rates and SNRs across various settings, compared with other state-of-the-art attacks.
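To make the pipeline concrete, the following is a minimal, hypothetical sketch in Python/NumPy of the two steps the abstract names: PSS approximated as room-impulse-response (RIR) convolution, and APNG's VAD-gated patch placement. The function names, the energy-threshold VAD, and all parameter values are illustrative assumptions, not the paper's implementation; the actual method additionally optimizes the patch against the ASR model and applies the SPL-based adaptive minimization, both omitted here.

```python
import numpy as np

def simulate_physical_audio(clean, rir):
    """PSS step (sketch): convolve clean audio with a room impulse
    response to approximate over-the-air playback in a room."""
    played = np.convolve(clean, rir)[: len(clean)]
    # Normalize to avoid clipping after convolution.
    return played / (np.max(np.abs(played)) + 1e-9)

def energy_vad(audio, frame_len=400, threshold_db=-35.0):
    """Toy energy-based voice activity detector (an assumption; the
    paper's detector may differ): mark frames whose RMS energy,
    relative to the whole utterance, exceeds a dB threshold."""
    mask = np.zeros(len(audio), dtype=bool)
    ref = np.sqrt(np.mean(audio ** 2)) + 1e-9
    for start in range(0, len(audio) - frame_len, frame_len):
        frame = audio[start : start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
        if 20.0 * np.log10(rms / ref) > threshold_db:
            mask[start : start + frame_len] = True
    return mask

def embed_patch(audio, patch_noise, vad_mask):
    """APNG placement (sketch): add the adversarial patch only in
    non-silent regions, where speech energy helps mask it."""
    adv = audio.copy()
    adv[vad_mask] += patch_noise[vad_mask]
    return np.clip(adv, -1.0, 1.0)
```

In a full attack loop, `patch_noise` would be updated by gradient descent on the target ASR model's loss, computed over `simulate_physical_audio(embed_patch(...), rir)` for the selected RIRs, so that the patch survives physical playback.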