Episodic Memory For Domain-Adaptable, Robust Speech Emotion Recognition

James Tavernor, Matthew Perez, Emily Mower Provost

Published: 2023, Last Modified: 18 May 2025, Venue: INTERSPEECH 2023, License: CC BY-SA 4.0

Abstract: Emotion conveys abundant information that can improve the user experience of various automated systems, in addition to communicating information important for managing well-being. Human speech conveys emotion, but speech emotion recognition models do not perform well in unseen environments. This limits the ubiquitous use of speech emotion recognition models. In this paper, we investigate how a model can be adapted to unseen environments without forgetting previously learned environments. We show that memory-based methods maintain performance on previously seen environments while still being able to adapt to new environments. These methods enable continual training of speech emotion recognition models following deployment while retaining previous knowledge, working towards a more general, adaptable, acoustic model.
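The abstract does not say which memory-based methods the paper evaluates. As an illustration of the general idea only, the sketch below shows one common form: a fixed-size episodic memory filled by reservoir sampling, with stored examples from earlier environments mixed into each batch from the new environment. The names `EpisodicMemory` and `replay_batch` are hypothetical, and this is not necessarily the authors' approach.

```python
import random


class EpisodicMemory:
    """Fixed-size buffer of past examples (hypothetical helper, not from the paper)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # stored examples from previously seen environments
        self.seen = 0     # total number of examples offered to the buffer
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling: each offered example is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


def replay_batch(new_batch, memory, n_replay):
    """Mix new-environment examples with examples replayed from memory."""
    return list(new_batch) + memory.sample(n_replay)


# Assumed toy data: (environment, utterance id) pairs stand in for real features.
memory = EpisodicMemory(capacity=5)
for i in range(100):
    memory.add(("env_A", i))

new_batch = [("env_B", i) for i in range(3)]
mixed = replay_batch(new_batch, memory, n_replay=2)
```

Training on `mixed` rather than on `new_batch` alone keeps examples from the earlier environment in every update, which is the intuition behind retaining previous knowledge while adapting to a new environment.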