CDA-GNN: A Chain-driven Accelerator for Efficient Asynchronous Graph Neural Network

Published: 01 Jan 2024, Last Modified: 18 May 2025. DAC 2024. License: CC BY-SA 4.0
Abstract: Asynchronous Graph Neural Networks (AGNNs) have attracted much research attention because they converge faster than synchronous GNNs. However, existing software and hardware solutions for AGNN suffer from redundant computation and excessive off-chip communication due to irregular state propagation along the dependency chains between vertices. This paper proposes CDA-GNN, a chain-driven asynchronous accelerator for efficient AGNN inference. Specifically, CDA-GNN incorporates a chain-driven asynchronous execution approach into a novel accelerator design to regularize vertex state propagation, reducing redundant computation and off-chip communication, and further introduces a chain-aware data caching method to improve data locality for AGNN. We have implemented and evaluated CDA-GNN on a Xilinx Alveo U280 FPGA card. Compared with the cutting-edge software solutions (i.e., Dorylus and AMP) and hardware solutions (i.e., BlockGNN and FlowGNN), CDA-GNN improves AGNN inference performance by an average of 1,173x, 182.4x, 10.2x, and 7.9x, and reduces energy consumption by 2,241x, 242.2x, 12.4x, and 8.9x, respectively.
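The core idea of chain-driven execution can be illustrated with a minimal, hypothetical sketch: decompose the graph into dependency chains and push each vertex's freshly updated state to its successor on the chain, so that updates are consumed immediately instead of being recomputed or refetched later. This is only a conceptual illustration under assumed names (build_chains, propagate_chain, a generic combine function); it is not the accelerator's actual design or interface.

```python
# Conceptual sketch of chain-driven asynchronous propagation (assumed names, not the CDA-GNN API).
from collections import defaultdict


def build_chains(edges, num_vertices):
    """Greedily decompose the graph into vertex-disjoint chains (paths).

    Visiting vertices chain-by-chain regularizes state propagation: each vertex
    is processed right after its predecessor on the chain, so the predecessor's
    updated state is reused immediately rather than reloaded later.
    """
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    visited, chains = set(), []
    for start in range(num_vertices):
        if start in visited:
            continue
        chain, v = [], start
        while v is not None and v not in visited:
            visited.add(v)
            chain.append(v)
            # Extend the chain along one unvisited successor, if any.
            v = next((w for w in succ[v] if w not in visited), None)
        chains.append(chain)
    return chains


def propagate_chain(chain, state, combine):
    """Asynchronously push each vertex's latest state to the next vertex on the chain."""
    for u, v in zip(chain, chain[1:]):
        state[v] = combine(state[v], state[u])  # uses u's already-updated state
    return state


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (4, 2)]
    chains = build_chains(edges, num_vertices=5)
    state = {v: float(v) for v in range(5)}
    for c in chains:
        state = propagate_chain(c, state, combine=max)
    print(chains, state)
```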