PLHGMDA: Pre-trained Language Model and Heterogeneous Graph Neural Network for MiRNA-Disease Association Prediction
Abstract: MicroRNAs (miRNAs) play a crucial role in the pathogenesis and progression of various diseases. The precise identification of miRNA-disease associations (MDAs) holds significant implications for disease diagnosis and treatment. Although deep learning has achieved remarkable progress in MDA prediction, existing methods still face two critical challenges: (1) inadequate utilization of the intrinsic semantic information of biological entities, and (2) insufficient modeling of heterogeneous relationships in miRNA-disease networks. To address these limitations, this study proposes PLHGMDA, a novel prediction model that integrates fine-tuned pre-trained language models with heterogeneous graph neural networks. First, we employ pre-trained language models to encode node attributes in the constructed miRNA-disease heterogeneous graph, fine-tuning their parameters to adapt them specifically for MDA prediction and thereby generating semantically enriched node embeddings. Subsequently, we apply a heterogeneous graph neural network with multi-level attention and multi-depth message passing to derive miRNA and disease embeddings that integrate both rich semantic features and heterogeneous graph structure. Finally, the concatenated miRNA and disease embeddings are used to predict potential MDAs. Notably, to bridge the difference in learning-rate sensitivity between pre-trained language models and graph neural networks, we develop a progressive training strategy with temporally staggered parameter unfreezing. Experimental results show that PLHGMDA outperforms baseline methods, and cross-dataset validation further demonstrates its efficacy in predicting both unseen disease associations and undocumented associations.
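The progressive training strategy mentioned above can be sketched as a simple epoch-dependent schedule: the graph neural network and prediction head train from the start, while the pre-trained language model's parameters remain frozen until a later epoch. This is a minimal illustrative sketch, not the authors' implementation; the module names and the unfreeze epoch are assumptions.

```python
# Hypothetical sketch of temporally staggered parameter unfreezing:
# the PLM encoder stays frozen early on so the randomly initialized
# GNN can stabilize before the more learning-rate-sensitive PLM
# weights start updating. All names below are illustrative.

def trainable_modules(epoch, plm_unfreeze_epoch=5):
    """Return which parameter groups receive gradient updates at `epoch`."""
    modules = ["hgnn", "predictor"]   # GNN and MLP head train throughout
    if epoch >= plm_unfreeze_epoch:   # PLM joins training later
        modules.append("plm")
    return modules

# Example schedule over the first few epochs:
schedule = {e: trainable_modules(e) for e in range(7)}
```

In a typical deep-learning framework, the same schedule would be realized by toggling `requires_grad` on the PLM's parameters (or by adding its parameter group to the optimizer) once the unfreeze epoch is reached.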
External IDs: dblp:conf/icic/WuYZWBZ25