Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 | MM2024 Poster | CC BY 4.0
Abstract: Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress. However, most existing studies ignore the complementarity and semantic consistency that multimodal physiological signals recorded under the same stimulation task exhibit in complex spatio-temporal patterns. In this paper, we introduce MRLMC, a multimodal physiological signal representation learning framework that uses a Siamese architecture with multiscale contrasting for depression recognition. First, fNIRS and EEG are transformed into different but correlated data by a time-domain data augmentation strategy. Then, we design a spatio-temporal contrasting module that learns representations of fNIRS and EEG through weight-sharing multiscale spatio-temporal convolution. Furthermore, to enhance the learning of semantic representations associated with the stimulation tasks, we propose a semantic consistency contrast module that aims to maximize the semantic similarity of fNIRS and EEG. Extensive experiments on publicly available and self-collected multimodal physiological signal datasets indicate that MRLMC outperforms state-of-the-art models. Moreover, the proposed framework can be transferred to downstream multimodal time-series tasks. We will release the code and weights after review.
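The cross-modal contrasting idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the MRLMC implementation: the abstract does not specify the augmentation, the encoders, or the loss, so the scaling-plus-jitter augmentation, the symmetric InfoNCE loss, and every name and parameter below (`augment`, `cross_modal_info_nce`, `temperature`, and so on) are assumptions chosen for illustration. The encoders are omitted; `z_a` and `z_b` stand in for fNIRS and EEG embeddings of the same trials.

```python
import numpy as np

def augment(x, rng, jitter_std=0.05, scale_std=0.1):
    """Hypothetical time-domain augmentation: per-channel scaling plus additive jitter.

    x has shape (batch, channels, time); the output has the same shape.
    """
    scale = rng.normal(1.0, scale_std, size=(x.shape[0], x.shape[1], 1))
    noise = rng.normal(0.0, jitter_std, size=x.shape)
    return x * scale + noise

def l2_normalize(z, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_modal_info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed stand-in for the paper's contrast loss).

    Row i of z_a and row i of z_b are treated as the positive pair;
    all other rows in the batch act as negatives.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature
    idx = np.arange(len(z_a))
    loss_ab = -log_softmax(logits)[idx, idx].mean()
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)
```

Minimizing this loss pulls embeddings of the same trial from the two modalities together and pushes embeddings of different trials apart, which is one common way to "maximize semantic similarity" across modalities.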
Primary Subject Area: [Engagement] Emotional and Social Signals
Relevance To Conference: This paper falls within the field of Emotional and Social Signals. It studies multimodal physiological signals for the diagnosis of mental illness and proposes a multimodal physiological signal representation learning framework, based on a Siamese architecture with multiscale contrasting, for depression recognition. The method improves the accuracy of depression recognition from multimodal physiological signals and achieves strong performance on public and self-collected datasets. As a top conference in the field of multimodality, ACM MM has been committed to promoting the development and innovation of multimodal technology. The research in this paper is closely aligned with the theme of the conference: it helps advance multimodal data analysis and offers participants new research ideas and perspectives. In addition, we believe that publishing this paper will draw more researchers' interest and attention to multimodal physiological signal processing and further promote the development of the multimodal field.
Submission Number: 4971