Keywords: Supramodal emotion concept; Human behavior; Replay strategy
TL;DR: A neuroscience-inspired learning strategy that builds supramodal emotion concepts for emotion recognition
Abstract: Multimodal emotion recognition has shown promise but is often hindered by the complexity of integrating heterogeneous sensory inputs. Intriguingly, the human brain addresses this challenge through abstract, modality-independent emotion schemas, known as supramodal emotion concepts, which are learned gradually from emotional experiences across different sensory modalities. Here, we propose a learning strategy to construct supramodal emotion concepts across vision, text, and audio. In a decoupling framework, each modality’s data repeatedly passes through a shared emotion encoder and a corresponding modality-specific non-emotion encoder, extracting modality-independent emotion representations. Inspired by hippocampal replay in humans, we aggregate these representations from a memory pool during downstream emotion recognition to form supramodal emotion concepts. We demonstrate the effectiveness of this approach in multiple settings: (1) a lightweight image-based model achieves state-of-the-art results on several benchmark datasets with lower complexity than existing unimodal methods; (2) unimodal models using vision, text, or audio from video clips achieve performance comparable to multimodal models; and (3) concept-guided multimodal models further improve performance, surpassing the current state of the art.
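The abstract describes a decoupling architecture (one shared emotion encoder plus per-modality non-emotion encoders) and a replay-style memory pool. The sketch below illustrates that structure in PyTorch under stated assumptions: the module names, feature dimensions, per-modality input projections, and mean-pooling aggregation are all illustrative placeholders, not the paper's actual implementation.

```python
# Minimal sketch of the decoupling framework described in the abstract.
# Assumptions: feature widths, the input projections, and mean-pooled
# aggregation are illustrative; the paper does not specify these details.
import torch
import torch.nn as nn


class DecoupledEncoder(nn.Module):
    def __init__(self, in_dims, d=128):
        super().__init__()
        # Per-modality projections into a common feature space (assumption,
        # needed so one shared encoder can accept all modalities).
        self.project = nn.ModuleDict({m: nn.Linear(k, d) for m, k in in_dims.items()})
        # Shared emotion encoder: one set of weights for vision, text, and
        # audio, so it is pushed toward modality-independent emotion features.
        self.emotion = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
        # Modality-specific non-emotion encoders absorb the remaining,
        # modality-dependent content.
        self.non_emotion = nn.ModuleDict(
            {m: nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
             for m in in_dims}
        )

    def forward(self, x, modality):
        h = self.project[modality](x)
        return self.emotion(h), self.non_emotion[modality](h)


in_dims = {"vision": 512, "text": 768, "audio": 256}  # illustrative widths
model = DecoupledEncoder(in_dims)

# Memory pool of emotion representations, filled as each modality's data
# passes through the shared emotion encoder.
memory_pool = []
for modality, dim in in_dims.items():
    x = torch.randn(8, dim)  # stand-in for one batch of modality features
    emo, _ = model(x, modality)
    memory_pool.append(emo.detach())

# Hippocampal-replay-style aggregation (mean pooling is an assumption):
# stored representations are combined into a supramodal emotion concept
# that could guide a downstream recognition head.
concept = torch.cat(memory_pool).mean(dim=0)
print(concept.shape)  # torch.Size([128])
```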
Primary Area: applications to neuroscience & cognitive science
Submission Number: 1636