Multimodal Models for Contextual Affect Assessment in Real-Time

Published: 01 Jan 2019 (Last Modified: 18 Nov 2023), CogMI 2019
Abstract: Most affect classification schemes rely on single-cue models that, while nearly accurate, fall short of the required accuracy under certain conditions. We investigate how the holism of a multimodal solution can be exploited for affect classification. This paper presents the design and implementation of a prototype, stand-alone, real-time multimodal affective state classification system. The system uses speech and facial muscle movements to create a holistic classifier: it combines a facial expression classifier with a speech classifier that analyses both paralanguage and propositional content. The proposed classification scheme comprises a Support Vector Machine (SVM) for paralanguage, a K-Nearest Neighbor (KNN) classifier for propositional content, and an InceptionV3 neural network for facial expressions of affective states. The SVM and Inception models achieved validation accuracies of 99.2% and 92.78%, respectively.
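As one illustration of how such a three-branch ensemble might be wired together, the minimal sketch below trains an SVM on paralanguage features and a KNN on propositional-content features, then fuses their class probabilities with a stand-in for the InceptionV3 facial branch. The feature representations, the affect label set, the probability-averaging fusion rule, and the uniform facial-branch output are all assumptions for illustration; the abstract does not specify how the system combines its classifiers.

```python
# A minimal late-fusion sketch of the classifier ensemble described above.
# Features, labels, and the fusion rule are hypothetical stand-ins, not
# the paper's actual method.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
AFFECT_LABELS = ["happy", "sad", "angry", "neutral"]  # hypothetical label set

# Stand-in features: paralanguage (e.g. pitch/energy statistics) and
# propositional content (e.g. bag-of-words vectors). Real features would
# come from an audio front end and a speech transcript, respectively.
X_para = rng.normal(size=(200, 12))
X_prop = rng.normal(size=(200, 50))
y = rng.integers(0, len(AFFECT_LABELS), size=200)

svm = SVC(probability=True).fit(X_para, y)                 # paralanguage cue
knn = KNeighborsClassifier(n_neighbors=5).fit(X_prop, y)   # propositional cue

def fuse(p_face, p_para, p_prop):
    """Average per-modality class probabilities (one plausible fusion rule)."""
    return np.argmax((p_face + p_para + p_prop) / 3.0, axis=1)

# The facial branch would be an InceptionV3 network emitting per-frame class
# probabilities; a uniform distribution stands in for it here.
p_face = np.full((200, len(AFFECT_LABELS)), 1.0 / len(AFFECT_LABELS))
pred = fuse(p_face, svm.predict_proba(X_para), knn.predict_proba(X_prop))
print("fused predictions:", pred[:10])
```

In a real-time deployment, each branch would score the current window of audio, transcript, and video frames, and the fusion step would run once per window to emit the holistic affect label.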