- Abstract: Recent work has revealed that state-of-the-art machine-learning-based Automatic Speech Recognition (ASR) systems are considerably vulnerable to crafted adversarial examples. However, because existing audio adversarial attacks are tied to the specific machine learning models of individual ASR systems, they still lack model transferability and cannot be readily configured for different deployment scenarios. In this work, we propose a novel untargeted adversarial example generation method for ASR systems that shifts adversarial example generation from the high-level machine learning models to the low-level feature extraction stage. By exploiting the fundamental impact and direct configurability of low-level features, the proposed method generates transferable and configurable adversarial examples for perturbing ASR systems. In the evaluation, we test the proposed attack on six commercial ASR models. The results show that the method achieves strong transferability and good perturbation effectiveness. It can also configure the adversarial examples with desired audio attributes, allowing better adaptation to different scenarios.
- Keywords: Automatic Speech Recognition, Adversarial Example, Transferable, Black-box, Reconfiguration