Abstract: Graph neural networks (GNNs) learn to represent nodes by aggregating information from
their neighbors. As GNNs increase in depth, their receptive field grows exponentially, leading
to high memory costs. Several works in the literature have proposed to address this shortcoming by sampling subgraphs or by using historical embeddings. These methods have mostly focused on benchmarks of single-label node classification on homophilous graphs, where neighboring nodes often share the same label. Moreover, most of them rely on static heuristics that may not generalize across different graphs or tasks. We argue that the sampling method
should be adaptive, adjusting to the complex structural properties of each graph. To this
end, we introduce GRAPES, an adaptive sampling method that learns to identify the set of
nodes crucial for training a GNN. GRAPES trains a second GNN to predict node sampling
probabilities by optimizing the downstream task objective. We evaluate GRAPES on various
node classification benchmarks involving homophilous as well as heterophilous graphs. We
demonstrate GRAPES’ effectiveness in terms of both accuracy and scalability, particularly on multi-label heterophilous graphs. Additionally, GRAPES uses orders of magnitude less GPU memory
than a strong baseline based on historical embeddings. Unlike other sampling methods,
GRAPES maintains high accuracy even with smaller sample sizes and, therefore, can scale
to massive graphs. Our implementation is publicly available online.
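To make the core mechanism in the abstract concrete (a second GNN whose predicted sampling probabilities are trained through the downstream task loss), here is a minimal PyTorch sketch. It is our own illustration, not the released GRAPES code: the names (`SamplerGNN`, `sample_mask`), the dense toy adjacency, and the straight-through top-k estimator are simplifying assumptions; the actual method may use a different gradient estimator and sparse, layer-wise sampling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGCNLayer(nn.Module):
    """One mean-aggregation message-passing layer (simplified GCN)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, x, adj):
        # adj: dense row-normalized adjacency (n, n); fine for a toy sketch.
        return self.lin(adj @ x)

class SamplerGNN(nn.Module):
    """Second GNN that scores nodes; scores become inclusion probabilities."""
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.conv = TinyGCNLayer(d_in, d_hid)
        self.score = nn.Linear(d_hid, 1)

    def forward(self, x, adj):
        h = torch.relu(self.conv(x, adj))
        return torch.sigmoid(self.score(h)).squeeze(-1)  # (n,) in (0, 1)

def sample_mask(probs, k):
    # Straight-through estimator: hard top-k mask in the forward pass,
    # gradients flow through the soft probabilities in the backward pass.
    hard = torch.zeros_like(probs)
    hard[probs.topk(k).indices] = 1.0
    return hard + probs - probs.detach()

# --- toy usage ---
n, d, k, n_classes = 20, 8, 5, 3
x = torch.randn(n, d)
adj = torch.rand(n, n)
adj = adj / adj.sum(dim=1, keepdim=True)   # row-normalize
y = torch.randint(0, n_classes, (n,))

sampler = SamplerGNN(d, 16)
classifier = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, n_classes))
opt = torch.optim.Adam(list(sampler.parameters()) + list(classifier.parameters()), lr=1e-2)

for _ in range(5):
    probs = sampler(x, adj)
    mask = sample_mask(probs, k)             # keep k nodes
    adj_s = adj * mask.unsqueeze(0)          # zero out messages from unsampled nodes
    h = adj_s @ x                            # aggregate over sampled neighbors only
    loss = F.cross_entropy(classifier(h), y) # task loss also trains the sampler
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design point this sketch captures is that the sampler receives no separate supervision: its parameters are updated only by gradients of the downstream classification loss, so the learned sampling probabilities adapt to whatever structure helps the task.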
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Wenbing_Huang1
Submission Number: 3923