Abstract: Emotion recognition remains a challenging yet essential task in affective computing, spanning fields from psychology to human-computer interaction. This study introduces a novel approach to improve emotion recognition by integrating multimodal physiological signal interaction networks with graph neural networks. We explored five undirected functional connectivity methods for constructing physiologic networks: Pearson correlation coefficient, maximal information coefficient, phase-locking value, phase lag index, and time-delay stability (TDS). These methods capture the relationships between the featured waveforms from electroencephalography and peripheral signals (electrocardiography, respiration, and skin conductance). The resulting physiologic networks, combined with extracted waveform features, were fed into graph attention networks (GATs) and graph isomorphism networks (GINs) for emotion classification. Our model was trained on the DEAP dataset and tested on the MAHNOB-HCI dataset to evaluate its generalizability. The TDS-based GAT and GIN models outperformed conventional classifiers, such as support vector machines, convolutional neural networks, and standard graph convolutional neural networks, in recognizing arousal and valence states. Specifically, the proposed method achieved $F_1$ scores of 83.38% for arousal and 82.52% for valence on cross-dataset emotion recognition. These results underscore the importance of incorporating dynamic signal coupling and multimodal physiological data to improve emotion recognition accuracy and robustness across different datasets, highlighting the potential of this approach for practical applications.
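As a rough illustration of how an undirected connectivity matrix of the kind listed above might be built, the following minimal sketch computes three of the five measures (Pearson correlation coefficient, phase-locking value, and phase lag index) between channels. It is not the authors' implementation: the function names `analytic_phase` and `connectivity`, the NumPy-only FFT-based analytic signal, and the choice to zero the diagonal are all assumptions made for illustration, and the maximal information coefficient and TDS are omitted.

```python
import numpy as np


def analytic_phase(x):
    """Instantaneous phase from the FFT-based analytic signal (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies, drop negative ones.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.angle(np.fft.ifft(spectrum * h))


def connectivity(signals, method="plv"):
    """Symmetric (n_channels, n_channels) weight matrix with a zero diagonal.

    signals: array of shape (n_channels, n_samples).
    method:  "pcc" (absolute Pearson correlation), "plv", or "pli".
    """
    signals = np.asarray(signals, dtype=float)
    if method == "pcc":
        adj = np.abs(np.corrcoef(signals))
        np.fill_diagonal(adj, 0.0)  # assumption: no self-loops in the graph
        return adj
    if method not in ("plv", "pli"):
        raise ValueError(f"unknown method: {method}")

    phases = np.array([analytic_phase(s) for s in signals])
    n = phases.shape[0]
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dphi = phases[i] - phases[j]
            if method == "plv":
                # Length of the mean unit phasor of the phase difference.
                w = np.abs(np.mean(np.exp(1j * dphi)))
            else:
                # PLI: consistency of the sign of the phase difference.
                w = np.abs(np.mean(np.sign(np.sin(dphi))))
            adj[i, j] = adj[j, i] = w
    return adj


# Usage: three synthetic channels, two phase-locked at 10 Hz and one at 13 Hz.
t = np.arange(1000) / 1000.0
demo = np.vstack([
    np.sin(2 * np.pi * 10 * t),
    np.sin(2 * np.pi * 10 * t + np.pi / 4),
    np.sin(2 * np.pi * 13 * t),
])
plv_matrix = connectivity(demo, "plv")
```

In a pipeline like the one the abstract describes, a matrix of this form would plausibly serve as the weighted adjacency of the graph passed to a GAT or GIN, with per-channel waveform features as node attributes; how the authors threshold or normalize the weights is not stated here.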
External IDs: dblp:journals/tamd/CaiGWLL25