RTGA: A Redundancy-free Accelerator for High-Performance Temporal Graph Neural Network Inference

Published: 01 Jan 2024, Last Modified: 18 May 2025. DAC 2024. License: CC BY-SA 4.0.
Abstract: Temporal Graph Neural Networks (TGNNs) have attracted much research attention because they can capture the dynamic nature of complex networks. However, existing solutions suffer from redundant computation and excessive off-chip communication during TGNN inference, because they rely on redundant graph sampling and repeatedly fetch vertex features and vertex memory. This paper proposes RTGA, a redundancy-free accelerator for high-performance TGNN inference. Specifically, RTGA integrates a redundancy-aware execution approach, built on a temporal tree, into a novel accelerator design that eliminates unnecessary data processing, reducing redundant computations and off-chip communications; it also designs a temporal-aware data caching method to improve data locality for TGNNs. We have implemented and evaluated RTGA on a Xilinx Alveo U280 FPGA card. Compared with cutting-edge software solutions (i.e., TGN and TGL) and hardware solutions (i.e., BlockGNN and FlowGNN), RTGA improves TGNN inference performance by an average of 473.2x, 87.4x, 8.2x, and 6.9x and saves energy by 542.8x, 102.2x, 9.4x, and 8.3x, respectively.
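To make the abstract's core idea concrete, the following is a minimal Python sketch of redundancy-free temporal neighbor sampling: each (vertex, time) neighborhood is sampled and fetched at most once, then shared across all queries that need it. The edge list and the names temporal_neighbors and dedup_sample are hypothetical, chosen here for illustration; RTGA's actual temporal-tree structure and hardware datapath are more elaborate, and this sketch only captures the deduplication (memoization) intuition in software.

    # Toy temporal edge list: (src, dst, timestamp)
    EDGES = [
        (0, 1, 1.0), (0, 2, 2.0), (1, 2, 3.0),
        (2, 3, 4.0), (0, 3, 5.0), (1, 3, 6.0),
    ]

    def temporal_neighbors(node, t):
        """All neighbors of `node` reached by interactions strictly before time t."""
        return [(dst, ts) for src, dst, ts in EDGES if src == node and ts < t]

    def dedup_sample(queries):
        """Serve a batch of (node, time) queries, sampling each distinct
        temporal neighborhood only once and reusing the result -- the
        deduplication idea behind redundancy-free TGNN inference."""
        cache = {}
        results = {}
        for node, t in queries:
            key = (node, t)
            if key not in cache:          # fetch/sample only on first use
                cache[key] = temporal_neighbors(node, t)
            results[key] = cache[key]
        return results, len(cache)

    queries = [(0, 5.0), (0, 5.0), (1, 6.5), (0, 5.0)]
    results, unique_fetches = dedup_sample(queries)
    print(f"{len(queries)} queries served by {unique_fetches} unique samples")

In this toy batch, four queries are answered with only two neighborhood fetches; the same intuition, applied hierarchically over overlapping temporal neighborhoods, is what a temporal tree exploits to cut redundant sampling and off-chip traffic.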