Abstract: Graph Neural Networks (GNNs) have shown great success in graph learning tasks such as modeling physics systems, predicting protein interfaces, classifying diseases, and learning molecular fingerprints. Driven by the complexity of real-world tasks and the scale of modern graph datasets, GNN models are becoming increasingly large and complicated to enhance their learning ability and prediction accuracy. A GNN contains two major components, graph operations and neural network (NN) operations, which are executed alternately during processing. This interleaved, complex processing poses a challenge for many computational platforms, especially those without accelerators. Optimization frameworks designed solely for deep learning or solely for graph computing cannot achieve good performance on GNNs. In this work, we first investigate the performance bottlenecks of GNN processing on multi-core processors. We then apply a set of optimization strategies that leverage the capabilities of multi-core processors for GNN processing. Specifically, we build a set of microkernels for the graph operations of GNNs using assembly instructions that exploit the SIMD processing units within a core, and we implement a task allocator based on a greedy algorithm to balance workloads across CPU cores. In addition, we optimize the NN operations according to their characteristics in GNN workloads. Experimental results show that the proposed methods achieve up to 2.88x and 4.08x performance improvements for various GNN models on Phytium 2000+ and Kunpeng 920, respectively, over a state-of-the-art GNN framework.
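The abstract mentions a greedy task allocator for balancing graph-operation workloads across CPU cores. The paper does not give the algorithm here, but a minimal sketch of the standard greedy load-balancing idea (assign each task, largest first, to the currently least-loaded core) might look like the following; the function name, inputs, and cost model are hypothetical illustrations, not the authors' actual implementation:

```python
def greedy_allocate(task_costs, n_cores):
    """Greedily assign tasks to cores, heaviest task first.

    task_costs: list of estimated per-task costs (hypothetical, e.g.
                proportional to the number of edges each task touches).
    Returns (assignment, loads): per-core task-index lists and total loads.
    """
    loads = [0.0] * n_cores
    assignment = [[] for _ in range(n_cores)]
    # Sort tasks by descending cost so large tasks are placed first,
    # which keeps the final per-core loads closer together.
    for tid, cost in sorted(enumerate(task_costs), key=lambda x: -x[1]):
        target = min(range(n_cores), key=lambda c: loads[c])
        loads[target] += cost
        assignment[target].append(tid)
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = greedy_allocate([5.0, 4.0, 3.0, 3.0, 3.0], 2)
    print(assignment, loads)
```

This longest-first greedy heuristic is a common baseline for multiprocessor scheduling; the paper's allocator may additionally account for GNN-specific factors such as irregular neighbor counts per vertex.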