Scalable and Effective Graph Neural Networks via Trainable Random Walk Sampling

Published: 01 Jan 2025, Last Modified: 14 Mar 2025. IEEE Trans. Knowl. Data Eng. 2025. License: CC BY-SA 4.0
Abstract: Graph Neural Networks (GNNs) have attracted increasing research attention for their effectiveness on graph mining tasks. However, full-batch training with stochastic gradient descent (SGD) requires substantial resources, because every computation that needs gradients must be kept in the memory of the acceleration device. This memory bottleneck makes it difficult to train classic GNNs on large-scale datasets within a single acceleration device. Meanwhile, message-passing based (spatial) GNN designs usually rely on the homophily assumption, which often fails on heterophilous graphs. In this paper, we propose a random walk extension for message-passing based GNNs, enriching them with spectral powers. We prove that our random walk sampling with appropriate correction coefficients generates an unbiased approximation of the $K$-order polynomial filter matrix, thus improving the neighborhood aggregation of the central nodes. A node-wise sampling strategy and historical embeddings allow the classic models to be trained with mini-batches, which extends the scalability of the base models. To show the effectiveness of our method, we conduct a thorough experimental analysis on frequently used benchmarks with diverse homophily and scale. The empirical results show that our model achieves significant performance improvements over the corresponding base GNNs and some state-of-the-art baselines in node classification tasks.
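The unbiasedness claim can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the filter is $\sum_{k=0}^{K} w_k P^k$ with the row-stochastic transition matrix $P = D^{-1}A$, in which case the position $v_k$ of a walk started at node $i$ satisfies $\mathbb{E}[x_{v_k}] = (P^k x)_i$, so the weights $w_k$ themselves act as the correction coefficients. The function names `exact_filter` and `random_walk_estimate`, the dense adjacency matrix, and the weight vector are all hypothetical choices made for illustration.

```python
import numpy as np


def exact_filter(A, x, weights):
    """Compute sum_k weights[k] * P^k @ x exactly, with P = D^{-1} A."""
    P = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    out = np.zeros_like(x, dtype=float)
    h = x.astype(float)                   # h holds P^k @ x, starting at k = 0
    for w in weights:
        out += w * h
        h = P @ h
    return out


def random_walk_estimate(A, x, weights, node, num_walks, rng):
    """Monte Carlo estimate of row `node` of the filtered features.

    Each walk starts at `node` and takes K = len(weights) - 1 steps.
    The feature at step k is weighted by weights[k]; averaging over
    walks gives an unbiased estimate under the assumptions above.
    """
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)
    total = np.zeros(x.shape[1:], dtype=float)
    for _ in range(num_walks):
        v = node
        for k, w in enumerate(weights):
            total += w * x[v]
            if k + 1 < len(weights):
                v = rng.choice(n, p=P[v])  # move to a random neighbor
    return total / num_walks
```

Calling `random_walk_estimate` with identity features `x = np.eye(n)` returns an estimate of one row of the filter matrix, which can be compared against the same row of `exact_filter(A, np.eye(n), weights)`. Only the sampled nodes are touched per walk, which is the property that makes node-wise mini-batching possible; the historical embeddings mentioned in the abstract are not modeled here.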