Decoupled Greedy Learning of Graph Neural Networks

YEWEN WANG; Jian Tang; Yizhou Sun; Guy Wolf

Decoupled Greedy Learning of Graph Neural Networks

YEWEN WANG, Jian Tang, Yizhou Sun, Guy Wolf

28 Sept 2020 (modified: 05 May 2023)ICLR 2021 Conference Blind SubmissionReaders: Everyone

Abstract: Graph Neural Networks (GNNs) become very popular for graph-related applications due to their superior performance. However, they have been shown to be computationally expensive in large scale settings, because their produced node embeddings have to be computed recursively, which scales exponentially with the number of layers. To address this issue, several sampling-based methods have recently been proposed to perform training on a subset of nodes while maintaining the fidelity of the trained model. In this work, we introduce a decoupled greedy learning method for GNNs (DGL-GNN) that, instead of sampling the input graph, decouples the GNN into smaller modules and associates each module with greedy auxiliary objectives. Our approach allows GNN layers to be updated during the training process without waiting for feedback from successor layers, thus making parallel GNN training possible. Our method achieves improved efficiency without significantly compromising model performances, which would be important for time or memory limited applications. Further, we propose a lazy-update scheme during training to further improve its efficiency. We empirically analyse our proposed DGL-GNN model, and demonstrate its effectiveness and superior efficiency through a range of experiments. Compared to the sampling-based acceleration, our model is more stable, and we do not have to trade-off between efficiency and accuracy. Finally, we note that while here we focus on comparing the decoupled approach as an alternative to other methods, it can also be regarded as complementary, for example, to sampling and other scalability-enhancing improvements of GNN training.

One-sentence Summary: We propose decoupled greedy learning method for GNNs (DGL-GNN) that improves training efficiency via parallelization over decoupled modules (e.g., layers) with greedy auxiliary objectives, without significantly compromising model performances.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Reviewed Version (pdf): https://openreview.net/references/pdf?id=ihJ4h4PwEA

7 Replies

Loading