Adversarial auto-encoders for speech based emotion recognition

Saurabh Sahu, Rahul Gupta, Ganesh Sivaraman, Wael AbdAlmageed, Carol Espy-Wilson

10 Jan 2023 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Recently, generative adversarial networks and adversarial auto-encoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition. They map the auto-encoder’s bottleneck layer output (termed code vectors) to different noise Probability Distribution Functions (PDFs), which can be further regularized to cluster based on class information. In addition, they allow the generation of synthetic samples by sampling the code vectors from the mapped PDFs. Inspired by these properties, we investigate the application of adversarial auto-encoders to the domain of emotion recognition. Specifically, we conduct experiments on the following two aspects: (i) their ability to encode high-dimensional feature vector representations for emotional utterances into a compressed space (with minimal loss of emotion class discriminability in the compressed space), and (ii) their ability to regenerate synthetic samples in the original feature space, to be later used for purposes such as training emotion recognition classifiers. We demonstrate the promise of adversarial auto-encoders with regard to these aspects on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus and present our analysis.
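The mechanism summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature and code dimensions, the standard Gaussian prior, the untrained linear maps standing in for the encoder, decoder and discriminator, and all function names are assumptions made for illustration. It shows only the forward pass: encoding features into code vectors, scoring codes against samples from the prior, and decoding prior samples into synthetic feature vectors. Training and the class-based regularization of the code space are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 100  # hypothetical size of an utterance-level feature vector
CODE_DIM = 2    # hypothetical size of the bottleneck code vector

# Untrained, randomly initialised linear maps stand in for the real networks.
W_enc = rng.normal(scale=0.1, size=(FEAT_DIM, CODE_DIM))
W_dec = rng.normal(scale=0.1, size=(CODE_DIM, FEAT_DIM))
w_disc = rng.normal(scale=0.1, size=CODE_DIM)


def encode(x):
    """Map feature vectors to code vectors (bottleneck output)."""
    return x @ W_enc


def decode(z):
    """Map code vectors back to the original feature space."""
    return z @ W_dec


def discriminate(z):
    """Estimated probability that z was drawn from the prior."""
    return 1.0 / (1.0 + np.exp(-(z @ w_disc)))


def sample_prior(n):
    """Draw code vectors from the assumed prior, a standard Gaussian."""
    return rng.normal(size=(n, CODE_DIM))


def aae_losses(x, eps=1e-8):
    """Return reconstruction, discriminator and encoder (generator) losses."""
    z_fake = encode(x)              # codes produced by the encoder
    z_real = sample_prior(len(x))   # codes drawn from the prior
    recon = np.mean((decode(z_fake) - x) ** 2)
    disc = -np.mean(np.log(discriminate(z_real) + eps)
                    + np.log(1.0 - discriminate(z_fake) + eps))
    gen = -np.mean(np.log(discriminate(z_fake) + eps))
    return recon, disc, gen


def generate(n):
    """Create synthetic feature vectors by decoding samples from the prior."""
    return decode(sample_prior(n))


x = rng.normal(size=(8, FEAT_DIM))   # placeholder data, not real speech features
recon, disc, gen = aae_losses(x)
synthetic = generate(5)
```

In an actual adversarial auto-encoder the three losses would be minimized in alternation, so that the distribution of encoder outputs moves toward the chosen prior while the decoder still reconstructs the input.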
