Abstract: Heterogeneous graph neural networks (HGNNs) excel at modeling heterogeneous information networks (HINs) and have demonstrated state-of-the-art performance across numerous tasks. However, previous works tend to study small datasets, which deviate significantly from real-world scenarios. Specifically, heterogeneous message passing in HGNNs incurs substantial memory and time overheads because it must aggregate heterogeneous neighbor features multiple times. To address this, we propose an Efficient Heterogeneous Graph Neural Network (EHGNN) that uses heterogeneous personalized PageRank (HPPR) to preserve the influence between all nodes, then approximates message passing and selectively loads neighbor information for a single aggregation, significantly reducing memory and time usage. In addition, we employ several lightweight techniques to maintain the performance of EHGNN. Evaluations on various HIN benchmarks for node classification and link prediction show that EHGNN surpasses the state of the art by 11$\%$ in performance. EHGNN also achieves a 400$\%$ boost in training and inference speed while using less memory. Notably, EHGNN can handle a 200-million-node, 1-billion-link HIN within 18 hours on a single machine using only 170 GB of memory, far below the previous minimum requirement of 600 GB.
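The idea of replacing repeated message passing with a single aggregation over the most influential nodes can be illustrated with a minimal sketch. It is not the EHGNN implementation: it uses plain personalized PageRank on a toy untyped graph rather than HPPR, and the function names, the restart probability `alpha`, the iteration count, and the toy graph and features are all assumptions made for illustration.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank (assumed restart prob. alpha)."""
    nodes = list(adj)
    scores = {v: 0.0 for v in nodes}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            out = adj[u]
            if not out:
                # Dangling node: send its walking mass back to the source.
                nxt[source] += (1 - alpha) * scores[u]
                continue
            share = (1 - alpha) * scores[u] / len(out)
            for v in out:
                nxt[v] += share
        # Restart: a fraction alpha of the total mass returns to the source.
        nxt[source] += alpha
        scores = nxt
    return scores


def top_k_neighbors(scores, source, k):
    """Return the k nodes (excluding the source) with the highest scores."""
    ranked = sorted((v for v in scores if v != source),
                    key=lambda v: scores[v], reverse=True)
    return ranked[:k]


def aggregate_once(features, scores, neighbors):
    """Single weighted average of neighbor features, weighted by PPR score."""
    total = sum(scores[v] for v in neighbors)
    dim = len(features[neighbors[0]])
    return [sum(scores[v] * features[v][i] for v in neighbors) / total
            for i in range(dim)]


# Hypothetical toy graph (adjacency lists) and two-dimensional features.
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
features = {"a": [0.5, 0.5], "b": [0.0, 1.0], "c": [1.0, 0.0], "d": [1.0, 1.0]}

scores = personalized_pagerank(adj, "a")
neighbors = top_k_neighbors(scores, "a", k=2)
summary = aggregate_once(features, scores, neighbors)
```

In this sketch only the features of the selected top-k nodes are read, and they are combined once, which is the source of the memory and time savings the abstract describes. How EHGNN extends the scores to multiple node and edge types is not specified here.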