Abstract: Cancer development is closely linked to the accumulation of mutations in driver genes, so identifying driver genes is crucial for understanding the molecular basis of cancer. Among the available methods, approaches based on Graph Neural Networks (GNNs) are some of the most effective tools for identifying driver genes, and they can combine biological networks with multi-omics data to further improve identification accuracy. However, many GNN frameworks use only a single-order moment to summarize neighbourhood information during message passing, which ignores the rich distribution information and gene features of neighbouring genes. In addition, GNN models typically require stacking hidden layers, which often leads to both over-smoothing and network degradation. To overcome these issues, a new framework called DMGNN is proposed for identifying driver genes. To capture rich distribution information and gene features, DMGNN applies a mix-moment embedding and attention-based feature selection over neighbouring genes. To address the problems caused by stacking hidden layers, a DeepWalk method is used to learn remote gene effects, which counters over-smoothing, and ResNet-based hidden layer aggregation is employed to mitigate network degradation. Experimental results demonstrate that the proposed model outperforms many existing methods for identifying cancer driver genes, achieving an AUROC of 0.856 on the STRING dataset, an improvement of at least 2% over other models. DMGNN is freely available at https://github.com/lavendar682/DMGNN.
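
To make the single-moment limitation concrete, the following is a minimal Python sketch and not the DMGNN implementation. It assumes that "moments" means element-wise central moments of the neighbouring genes' feature vectors; the function name `moment_aggregate`, the toy genes, and their feature values are hypothetical and chosen only for illustration. Two genes whose neighbourhoods have the same mean are indistinguishable under first-order aggregation, but adding the second-order moment separates them.

```python
from statistics import fmean


def moment_aggregate(features, neighbors, node, orders=(1, 2)):
    """Concatenate element-wise moments of a node's neighbour features.

    Order 1 is the plain mean; order k >= 2 is the k-th central moment.
    This is an illustrative assumption, not the published DMGNN layer.
    """
    rows = [features[n] for n in neighbors[node]]
    if not rows:
        # Isolated node: return zeros of the expected output width.
        dim = len(next(iter(features.values())))
        return [0.0] * (dim * len(orders))

    columns = list(zip(*rows))              # one tuple per feature dimension
    means = [fmean(col) for col in columns]

    out = []
    for k in orders:
        if k == 1:
            out.extend(means)
        else:
            out.extend(
                fmean([(x - m) ** k for x in col])
                for col, m in zip(columns, means)
            )
    return out


# Hypothetical toy graph: genes A and D each have two neighbours.
features = {
    "A": [0.0, 0.0], "B": [1.0, 0.0], "C": [3.0, 0.0],
    "D": [0.0, 0.0], "E": [2.0, 0.0], "F": [2.0, 0.0],
}
neighbors = {"A": ["B", "C"], "D": ["E", "F"]}

mean_only_a = moment_aggregate(features, neighbors, "A", orders=(1,))
mean_only_d = moment_aggregate(features, neighbors, "D", orders=(1,))
mixed_a = moment_aggregate(features, neighbors, "A")
mixed_d = moment_aggregate(features, neighbors, "D")

print(mean_only_a == mean_only_d)  # the means coincide
print(mixed_a != mixed_d)          # the second moment tells them apart
```

In a real GNN layer the concatenated moments would typically be passed through a learned transformation, and the attention-based feature selection described above would weight them; both steps are omitted here.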