AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs

Published: 01 Jan 2023, Last Modified: 11 Nov 2023. DAC 2023.
Abstract: Distributed GNN training on today's massive, densely connected graphs requires aggregating information from all neighboring nodes, which leads to an explosion of inter-server communication. This paper proposes AdaGL, a highly scalable end-to-end framework for rapid distributed GNN training. AdaGL's novelty lies in its adaptive-learning-based graph-allocation engine and its use of multi-resolution coarse representations of dense graphs. As a result, AdaGL achieves an unprecedented level of balanced server computation while minimizing communication overhead. Extensive proof-of-concept evaluations on billion-scale graphs show that AdaGL attains ∼30–40% faster convergence than prior art.
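As context for the allocation problem the abstract describes, here is a minimal, hypothetical sketch of the underlying objective: assign graph nodes to servers so that compute load stays balanced while cross-server edges (i.e., communication) stay low. The greedy heuristic, the `greedy_allocate` name, and the balance weight are illustrative assumptions, not AdaGL's actual adaptive-learning engine.

```python
# Hypothetical sketch of the balanced-partition / low-edge-cut objective that a
# graph-allocation engine must solve. The greedy heuristic below is an
# illustrative assumption, not the paper's algorithm.
from collections import defaultdict

def greedy_allocate(edges, num_nodes, num_servers):
    """Assign each node to a server, trading off load balance
    (computation) against cross-server edges (communication)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    assignment = {}           # node -> server
    load = [0] * num_servers  # nodes per server (compute proxy)

    # Visit higher-degree nodes first so dense regions tend to stay together.
    for node in sorted(range(num_nodes), key=lambda n: -len(adj[n])):
        best, best_score = 0, float("-inf")
        for s in range(num_servers):
            # Reward co-location with already-placed neighbors (fewer
            # inter-server messages); penalize heavily loaded servers.
            locality = sum(1 for nb in adj[node] if assignment.get(nb) == s)
            score = locality - 0.5 * load[s]  # 0.5: assumed balance weight
            if score > best_score:
                best, best_score = s, score
        assignment[node] = best
        load[best] += 1
    return assignment

# Example: split a small 6-node graph across 2 servers.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(greedy_allocate(edges, num_nodes=6, num_servers=2))
```

On the toy graph above, the two triangles land on different servers with only the single bridging edge (2, 3) crossing servers, which is the balance-versus-communication trade-off the abstract refers to.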