REmoNet: Reducing Emotional Label Noise via Multi-regularized Self-supervision

Weibang Jiang; Yu-Ting Lan; Bao-liang Lu

Published: 20 Jul 2024, Last Modified: 29 Jul 2024. MM2024 Poster. CC BY 4.0

Abstract: Emotion recognition based on electroencephalogram (EEG) has garnered increasing attention in recent years due to the non-invasiveness and high reliability of EEG measurements. Despite the promising performance achieved by numerous existing methods, several challenges persist. First, there is the challenge of emotional label noise, which stems from the assumption that emotions are consistently evoked and remain stable throughout the entire video observation. This assumption is difficult to uphold in practical experimental settings, leading to discrepancies between EEG signals and the anticipated emotional states. Second, there is a need to comprehensively capture the temporal-spatial-spectral characteristics of EEG signals and to cope with their low signal-to-noise ratio (SNR). To tackle these challenges, we propose a comprehensive pipeline named REmoNet, which leverages novel self-supervised techniques and multi-regularized co-learning. Two self-supervised methods, masked channel modeling via temporal-spectral transformation and emotion contrastive learning, are introduced during pre-training to facilitate the comprehensive understanding and extraction of emotion-relevant EEG representations. Additionally, fine-tuning with multi-regularized co-learning exploits feature-dependent information through intrinsic similarity, thereby mitigating emotional label noise. Experimental evaluations on two public datasets demonstrate that our proposed approach, REmoNet, surpasses existing state-of-the-art methods, showcasing its effectiveness in simultaneously handling raw EEG signals and noisy emotional labels.

Primary Subject Area: [Engagement] Emotional and Social Signals

Secondary Subject Area: [Experience] Interactions and Quality of Experience

Relevance To Conference:
1. The work focuses on EEG-based emotion recognition, integrating multimedia data in the form of EEG signals induced by video stimuli. This aligns with ACM Multimedia’s interest in multimedia data processing.
2. The proposed method utilizes two self-supervised learning techniques, an active area of research in ACM Multimedia. The work applies these techniques to EEG emotion recognition, addressing the scarcity of labeled data in this domain.
3. The method processes EEG signals, incorporating both temporal and spatial information. This aligns with ACM Multimedia’s focus on multimedia signal processing.
4. The work addresses the issue of emotional label noise, which is common in multimedia datasets. The proposed multi-regularized co-learning approach is highly relevant to ACM Multimedia, as it addresses the robustness of multimedia data labeling.

Submission Number: 3798