Fast Training of Sparse Graph Neural Networks on Dense Hardware

Matej Balog; Bart van Merriënboer; Subhodeep Moitra; Yujia Li; Daniel Tarlow


25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: Is sparse hardware necessary for training sparse GNNs? No. Does large-batch training work for sparse GNNs? Yes. So what? We can train a model in 13 minutes that previously took almost a day.

Abstract: Graph neural networks have become increasingly popular in recent years due to their ability to naturally encode relational input data and their ability to operate on large graphs by using a sparse representation of graph adjacency matrices. As we look to scale up these models using custom hardware, a natural assumption would be that we need hardware tailored to sparse operations and/or dynamic control flow. In this work, we question this assumption by scaling up sparse graph neural networks using a platform targeted at dense computation on fixed-size data. Drawing inspiration from optimization of numerical algorithms on sparse matrices, we develop techniques that enable training the sparse graph neural network model from Allamanis et al. (2018) in 13 minutes using a 512-core TPUv2 Pod, whereas the original training takes almost a day.
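The abstract does not spell out the techniques, so the following is only a minimal sketch of one general idea behind running sparse graph computation on fixed-size data: pad each graph's edge list to a fixed budget and mask out the padding, so every step does the same amount of work. The function names (`pad_edges`, `propagate`), the choice of node 0 as the padding target, and the sum aggregation are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: names and the padding scheme are assumptions,
# not taken from the paper or its notebook.

def pad_edges(edges, max_edges):
    """Pad a list of (src, dst) pairs to exactly max_edges entries.

    Returns (padded_edges, mask), where mask[i] is 1.0 for a real edge and
    0.0 for padding. Padded entries point at node 0 and are masked out later.
    """
    if len(edges) > max_edges:
        raise ValueError("graph has more edges than the fixed budget")
    n_pad = max_edges - len(edges)
    padded = list(edges) + [(0, 0)] * n_pad
    mask = [1.0] * len(edges) + [0.0] * n_pad
    return padded, mask


def propagate(node_values, padded_edges, mask):
    """One round of sum aggregation over a fixed-size edge list.

    The loop length depends only on the edge budget, not on the true edge
    count; the mask zeroes out contributions from padded entries.
    """
    out = [0.0] * len(node_values)
    for (src, dst), m in zip(padded_edges, mask):
        out[dst] += m * node_values[src]
    return out


# A 3-node graph with 3 real edges, padded to a budget of 5 edges.
edges = [(0, 1), (2, 1), (1, 2)]
padded, mask = pad_edges(edges, max_edges=5)
result = propagate([1.0, 2.0, 3.0], padded, mask)
```

On an accelerator the loop would typically be a gather followed by a scatter-add or segment sum over arrays of static shape; the plain-Python version above only shows the masking logic.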

Code: https://github.com/anonymous-authors-iclr2020/fast_training_of_sparse_graph_neural_networks_on_dense_hardware/blob/master/code.ipynb
