Abstract: Graph Neural Networks (GNNs) have shown success in learning from graph-structured data, with applications
to fraud detection, recommendation, and knowledge graph reasoning. However, training GNNs efficiently is
challenging because: 1) GPU memory capacity is limited and can be insufficient for large datasets, and 2) the
graph-based data structure causes irregular data access patterns. In this work, we provide a method to statistically
analyze and identify more frequently accessed data ahead of GNN training. Our data tiering method utilizes not only
the structure of the input graph but also an insight gained from the actual GNN training process to achieve
higher prediction accuracy. With our data tiering method, we additionally provide a new data placement and access
strategy to further minimize the CPU-GPU communication overhead. We also account for multi-GPU
GNN training and demonstrate the effectiveness of our strategy in a multi-GPU system. The evaluation
results show that our work reduces CPU-GPU traffic by 87–95% and improves GNN training speed over
existing solutions by 1.6–2.1× on graphs with hundreds of millions of nodes and billions of edges.