A Deep Ensemble Approach of Anger Detection from Audio-Textual Conversations

Published: 01 Jan 2022, Last Modified: 18 May 2025, ACII 2022, CC BY-SA 4.0
Abstract: Anger detection from conversations has a number of real-life applications, including improving interpersonal communication, providing customer services, and enhancing workplace performance. In this paper, we propose novel deep learning-based approaches for both offline and online anger detection from audio-textual data obtained from real-life conversations. For offline anger detection, which detects anger in a given audio-textual conversation, we introduce an ensemble approach that adapts an attention-based CNN architecture, a gender classifier, and BERT-based textual features to derive the anger of a conversation. For online anger detection, which predicts anger in the conversation at subsequent timestamps from the conversation at previous timestamps, we propose a transformer-based audio and textual ensemble technique to predict the anger of a future conversation. We demonstrate the efficacy of our proposed approaches using two datasets: the Bengali call-center dataset and the IEMOCAP dataset. Experimental results show that our proposed approaches outperform the state-of-the-art baselines by a significant margin. For offline anger detection, our model achieves an F1 score of 85.5% on the Bengali dataset and 91.4% on the IEMOCAP dataset. For online anger detection, our model yields an F1 score of 66.9% on the Bengali dataset and 67.7% on the IEMOCAP dataset.