Abstract: Electroencephalography (EEG) signals and eye movement signals, which represent internal physiological responses and external subconscious behaviors, respectively, have been shown to be reliable indicators for recognizing emotions. However, integrating these two modalities across multiple subjects presents several challenges: 1) designing a robust consistency metric that balances the consistency and divergences between heterogeneous modalities across multiple subjects; 2) simultaneously considering intra-modality and inter-modality information across multiple subjects; and 3) overcoming individual differences among multiple subjects and generating subject-invariant representations of the multimodal fused features. To address these challenges associated with multisource data (i.e., multiple modalities and subjects), we propose a novel comprehensive multisource learning network (CMSLNet) for cross-subject multimodal emotion recognition. Specifically, an instance-level adaptive robust consistency metric is first designed to better align the information between EEG signals and eye movement signals, identifying their consistency and divergences across various emotions. Subsequently, an attentive low-rank multimodal fusion (Att-LMF) method is developed to account for individual differences and dynamically learn intra-modality and inter-modality information, resulting in highly discriminative fused features. Finally, domain generalization is utilized to extract subject-invariant representations of the fused features, thus adapting to new subjects and enhancing the model's generalization. Through these elaborate designs, CMSLNet effectively incorporates the information from multisource data, thus significantly improving the accuracy and reliability of emotion recognition. Extensive experiments on two public datasets demonstrate the superior performance of CMSLNet. 
Specifically, CMSLNet achieves accuracies of 83.15% on the SEED-IV dataset and 87.32% on the SEED-V dataset, surpassing state-of-the-art methods by 3.62% and 4.60%, respectively.
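To make the low-rank fusion idea concrete, the following is a minimal sketch of a generic rank-r bilinear fusion of an EEG feature vector and an eye movement feature vector. It is not the Att-LMF module itself: the abstract does not specify the attention mechanism or the network architecture, so the attention step is omitted. The function names, feature dimensions, and randomly initialized factors are all assumptions made for illustration. The sketch shows that summing element-wise products of per-modality projections gives the same output as contracting the full outer-product tensor, without ever materializing the full weight tensor.

```python
import numpy as np


def low_rank_fusion(z_eeg, z_eye, factors_eeg, factors_eye):
    """Fuse two feature vectors with a rank-r bilinear map (hypothetical helper).

    z_eeg:        (d_eeg,) EEG features
    z_eye:        (d_eye,) eye movement features
    factors_eeg:  (r, d_eeg + 1, d_out) modality-specific low-rank factors
    factors_eye:  (r, d_eye + 1, d_out) modality-specific low-rank factors
    """
    # Append a constant 1 so that unimodal (intra-modality) terms survive
    # alongside the bimodal (inter-modality) interaction terms.
    a = np.append(z_eeg, 1.0)
    b = np.append(z_eye, 1.0)
    # Project each modality once per rank component: shape (r, d_out).
    proj_eeg = np.einsum("d,rdo->ro", a, factors_eeg)
    proj_eye = np.einsum("d,rdo->ro", b, factors_eye)
    # Element-wise product across modalities, then sum over rank components.
    return (proj_eeg * proj_eye).sum(axis=0)


def full_tensor_fusion(z_eeg, z_eye, factors_eeg, factors_eye):
    """Reference version that builds the full weight tensor explicitly."""
    a = np.append(z_eeg, 1.0)
    b = np.append(z_eye, 1.0)
    # Full weight tensor of shape (d_eeg + 1, d_eye + 1, d_out).
    weight = np.einsum("rio,rjo->ijo", factors_eeg, factors_eye)
    # Outer product of the two augmented feature vectors.
    outer = np.outer(a, b)
    return np.einsum("ij,ijo->o", outer, weight)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed toy sizes: 5 EEG features, 3 eye features, rank 2, 4 outputs.
    z_eeg, z_eye = rng.normal(size=5), rng.normal(size=3)
    f_eeg = rng.normal(size=(2, 6, 4))
    f_eye = rng.normal(size=(2, 4, 4))
    fused = low_rank_fusion(z_eeg, z_eye, f_eeg, f_eye)
    print(fused.shape)
    print(np.allclose(fused, full_tensor_fusion(z_eeg, z_eye, f_eeg, f_eye)))
```

The low-rank form stores r * (d_eeg + d_eye + 2) * d_out parameters rather than (d_eeg + 1) * (d_eye + 1) * d_out, which is the usual motivation for this kind of factorization when feature dimensions are large.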