FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training

Published: 18 Nov 2023, Last Modified: 30 Nov 2023
Venue: LoG 2023 Poster
Keywords: GNN; Performance; Data loading
TL;DR: A historical embedding cache for GNN training that identifies and reuses stable embeddings to reduce data loading
Abstract: A key performance bottleneck when training graph neural network (GNN) models on large, real-world graphs is loading node features onto a GPU. Because GPU memory is limited, these features must be stored on alternative devices with slower access (e.g., CPU memory), which makes expensive data movement necessary. Moreover, the irregularity of graph structures leads to poor data locality, which further exacerbates the problem. Consequently, existing frameworks capable of efficiently training large GNN models usually incur significant accuracy degradation because of the shortcuts they rely on. To address these limitations, we instead propose FreshGNN, a general-purpose GNN mini-batch training framework that leverages a historical cache to store and reuse GNN node embeddings instead of re-computing them from freshly fetched raw features at every iteration. Critical to its success, the corresponding cache policy uses a combination of gradient-based and staleness criteria to separate embeddings that are relatively stable and can be cached from those that must be re-computed to reduce estimation errors and the resulting loss in downstream accuracy. When paired with complementary system enhancements that support this selective historical cache, FreshGNN accelerates training on large graph datasets such as ogbn-papers100M and MAG240M by 2.7× to 20.5× and reduces memory access by 64.5% (85.7% higher than a raw feature cache), with less than 1% impact on test accuracy.
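The selective cache policy described in the abstract can be illustrated with a minimal sketch. This is not the FreshGNN implementation: the class name `HistoricalCache`, the parameters `grad_threshold` and `max_staleness`, and the use of fixed thresholds are all assumptions made for illustration. The sketch admits an embedding only when its gradient magnitude is small (a proxy for stability) and drops it once it has been cached for too many iterations.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CacheEntry:
    embedding: List[float]
    step_cached: int  # training iteration at which the embedding was stored


class HistoricalCache:
    """Hypothetical cache of node embeddings with two admission/eviction rules.

    Both thresholds are illustrative assumptions, not values from the paper.
    """

    def __init__(self, grad_threshold: float, max_staleness: int) -> None:
        self.grad_threshold = grad_threshold
        self.max_staleness = max_staleness
        self.entries: Dict[int, CacheEntry] = {}

    def lookup(self, node_id: int, step: int) -> Optional[List[float]]:
        """Return a cached embedding, or None if it must be re-computed."""
        entry = self.entries.get(node_id)
        if entry is None:
            return None
        # Staleness criterion: evict entries that have been cached too long.
        if step - entry.step_cached > self.max_staleness:
            del self.entries[node_id]
            return None
        return entry.embedding

    def update(self, node_id: int, embedding: List[float],
               grad_norm: float, step: int) -> None:
        """Admit a freshly computed embedding only if it looks stable."""
        # Gradient criterion: a small gradient suggests the embedding is
        # changing slowly, so reusing it should introduce little error.
        if grad_norm < self.grad_threshold:
            self.entries[node_id] = CacheEntry(list(embedding), step)
        else:
            self.entries.pop(node_id, None)
```

In a training loop built on this sketch, a `lookup` hit would let the sampler skip fetching the raw features of that node's neighborhood, while a miss would fall back to the normal feature-loading path followed by `update`.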
Submission Type: Extended abstract (max 4 main pages).
Submission Number: 124