Abstract: Multi-modal medical corpora, as novel tools for computer-aided medical diagnosis and learning research, hold significant value for exploring pathogenic mechanisms. However, in neuroscience, brain imaging data often lack semantic labels, which makes constructing a multi-modal medical corpus challenging. Functional magnetic resonance imaging (fMRI), a commonly used brain imaging modality, can be employed to study functional brain networks and to classify them in order to obtain such labels. Moreover, because data samples are limited, developing a graph foundation model for brain network classification is important. We propose a multi-level graph self-supervised learning (MLGSL) method that performs multi-level pretraining tasks to uncover disease-related correlation patterns. The resulting graph foundation model is then used to classify brain networks and obtain semantic labels. In addition, by pairing the identified potential biomarkers with relevant text matched through a large language model, a multi-modal medical corpus can be constructed. Specifically, MLGSL first performs a link prediction task on functional brain networks at the individual level and a link prediction task on the population-associated network at the population level. In the encoder of the pretraining tasks, the proposed multi-channel enhanced attention graph convolution strengthens the attention mechanism with connection strength while integrating visible node representations learned from different perspectives. MLGSL then predicts the category of functional brain networks through fine-tuning. Experiments demonstrate that MLGSL achieves the best brain network classification performance, enabling accurate semantic labels to be obtained. By combining brain imaging data with key textual information, a multi-modal medical corpus can be constructed, providing important value for interpreting pathological information and for the clinical diagnosis of diseases.
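To make the two-level pretraining pipeline described above concrete, the following is a minimal sketch, not the authors' implementation: it assumes a plain GCN encoder in place of the proposed multi-channel enhanced attention graph convolution, an inner-product link-prediction decoder, a hypothetical `pretrain_batches` loader yielding individual and population graphs, and an example setting of 90 ROIs with binary classification.

```python
# Hedged sketch of MLGSL-style pretraining + fine-tuning (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: A_hat @ X @ W, with A_hat a normalized adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, a_hat):
        return F.relu(a_hat @ self.lin(x))

class Encoder(nn.Module):
    """Stand-in encoder; the paper's multi-channel attention convolution is not reproduced here."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.g1 = SimpleGCNLayer(in_dim, hid_dim)
        self.g2 = SimpleGCNLayer(hid_dim, hid_dim)

    def forward(self, x, a_hat):
        return self.g2(self.g1(x, a_hat), a_hat)

def link_prediction_loss(z, adj_target):
    """Inner-product decoder: reconstruct edges from node embeddings."""
    logits = z @ z.t()
    return F.binary_cross_entropy_with_logits(logits, adj_target)

encoder = Encoder(in_dim=90, hid_dim=64)   # assumed: 90 ROIs per subject
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Pretraining: individual-level + population-level link prediction.
for subject_x, subject_a, pop_x, pop_a in pretrain_batches:  # hypothetical loader
    z_ind = encoder(subject_x, subject_a)   # functional brain network (individual level)
    z_pop = encoder(pop_x, pop_a)           # population-associated network (population level)
    loss = link_prediction_loss(z_ind, subject_a) + link_prediction_loss(z_pop, pop_a)
    opt.zero_grad(); loss.backward(); opt.step()

# Fine-tuning: graph-level classification to obtain semantic labels.
classifier = nn.Linear(64, 2)               # e.g. patient vs. control

def classify(x, a_hat):
    z = encoder(x, a_hat).mean(dim=0)       # mean-pool node embeddings into a graph embedding
    return classifier(z)
```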