GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks

TMLR Paper 3923 Authors

09 Jan 2025 (modified: 16 Apr 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Graph neural networks (GNNs) learn to represent nodes by aggregating information from their neighbors. As GNNs grow deeper, their receptive field grows exponentially, leading to high memory costs. Several works have proposed to address this shortcoming by sampling subgraphs or by using historical embeddings. These methods have mostly focused on benchmarks of single-label node classification on homophilous graphs, where neighboring nodes often share the same label. Moreover, most of these methods rely on static heuristics that may not generalize across different graphs or tasks. We argue that the sampling method should be adaptive, adjusting to the complex structural properties of each graph. To this end, we introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN. GRAPES trains a second GNN to predict node sampling probabilities by optimizing the downstream task objective. We evaluate GRAPES on various node classification benchmarks involving homophilous as well as heterophilous graphs. We demonstrate the effectiveness of GRAPES in terms of accuracy and scalability, particularly on multi-label heterophilous graphs. Additionally, GRAPES uses orders of magnitude less GPU memory than a strong baseline based on historical embeddings. Unlike other sampling methods, GRAPES maintains high accuracy even with smaller sample sizes and can therefore scale to massive graphs. Our implementation is publicly available online.
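The core mechanism in the abstract, a second GNN trained on the downstream loss to output per-node sampling probabilities, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the dense-adjacency layers, the straight-through Bernoulli estimator used to backpropagate through the discrete sample, and all names and hyperparameters below are illustrative assumptions.

```python
# Minimal sketch of adaptive neighbor sampling with a learned sampler GNN.
# Illustrative only; not the GRAPES reference implementation.
import torch
import torch.nn as nn

class OneLayerGCN(nn.Module):
    """A single mean-aggregation GCN layer (dense adjacency for brevity)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return self.lin(adj @ x / deg)

class LearnedSampler(nn.Module):
    """Second GNN that assigns each node an inclusion probability."""
    def __init__(self, in_dim, hidden=32):
        super().__init__()
        self.gnn = OneLayerGCN(in_dim, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        h = torch.relu(self.gnn(x, adj))
        return torch.sigmoid(self.score(h)).squeeze(-1)  # p_i per node

def sample_mask(probs):
    """Bernoulli sample with a straight-through estimator: the forward
    pass uses the hard 0/1 draw, the backward pass uses the gradient
    of the probabilities (an assumed gradient trick, not the paper's)."""
    hard = torch.bernoulli(probs.detach())
    return hard + probs - probs.detach()

# Toy usage: train classifier and sampler jointly on the task loss.
n, d, c = 100, 16, 5
x = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.05).float()
y = torch.randint(0, c, (n,))

classifier = OneLayerGCN(d, c)
sampler = LearnedSampler(d)
opt = torch.optim.Adam(
    list(classifier.parameters()) + list(sampler.parameters()), lr=1e-2
)

for step in range(5):
    mask = sample_mask(sampler(x, adj))  # differentiable node mask
    sub_adj = adj * mask.unsqueeze(0)    # keep only sampled neighbors
    logits = classifier(x, sub_adj)
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch, the task loss backpropagates through the soft mask into the sampler, so nodes whose inclusion helps classification receive higher sampling probabilities as training proceeds; this captures the adaptivity argument of the abstract without any fixed heuristic.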
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Wenbing_Huang1
Submission Number: 3923