Whose emotion matters? Speaking activity localisation without prior knowledge

Hugo C. C. Carneiro, Cornelius Weber, Stefan Wermter

Published: 2023, Last Modified: 20 May 2025. Neurocomputing 2023. CC BY-SA 4.0

Abstract: Highlights
• Extraction of reliable acoustic data from originally misaligned videos.
• Extraction of face images without prior annotation of the speaker’s location.
• Reliable audiovisual data provision for emotion recognition in multiparty dialogues.
• Dataset refinement improves audiovisual data used for emotion recognition.
• Refinement of the MELD dataset using CTC segmentation and active speaker detection.