- Abstract: Although deep learning has enabled unprecedented improvements in the performance of state-of-the-art speech emotion recognition (SER) systems, recent research on adversarial examples has cast doubt on their robustness by showing that deep neural networks are susceptible to small, imperceptible perturbations of the input. In this study, we evaluate how adversarial examples can be used to attack SER systems and propose the first black-box adversarial attack on SER systems. We also explore potential defenses, including adversarial training and generative adversarial networks (GANs), to enhance robustness. Experimental evaluations reveal several aspects of the effective use of adversarial examples that can be useful not only for SER robustness but also for other speech-based intelligent systems.
- Keywords: deep neural networks, adversarial examples, speech emotion recognition