DSpar: An Embarrassingly Simple Strategy for Efficient GNN training and inference via Degree-based Sparsification

Published: 14 Jul 2023, Last Modified: 14 Jul 2023. Accepted by TMLR.
Abstract: Running Graph Neural Networks (GNNs) on large graphs is notoriously inefficient. The inefficiency stems from sparse graph-based operations, which are hard to accelerate on commodity hardware, e.g., GPUs and CPUs. One potential solution is to ``sketch'' the original graph by removing unimportant edges, so that both training and inference are executed on the sparsified graph with improved efficiency. Traditional graph sparsification methods compute an edge importance score, i.e., effective resistance, from the graph topology with theoretical guarantees. However, estimating effective resistance is even more expensive than training the GNN itself. Later, learning-based sparsification methods were proposed to learn edge importance from data, but they incur significant overhead due to the extra learning process. Thus, both lines of work introduce significant ahead-of-training overhead. In this paper, we show experimentally and theoretically that effective resistance can be approximated using only node degree information, and that GNNs achieve similar node representations on the graph with and without sparsification. Based on this finding, we propose DSpar, which sparsifies the graph once before training using only node degree information, with negligible ahead-of-training overhead. In practice, for the training phase, DSpar is up to $5.9\times$ faster than the baseline with almost no accuracy drop. For the inference phase, DSpar reduces latency by up to $90\%$.
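To make the idea concrete, below is a minimal sketch of degree-based edge sampling. It assumes (as a proxy, not the paper's exact scheme) that the effective resistance of an undirected edge $(u, v)$ is approximated by $1/d_u + 1/d_v$, where $d_u$ is the degree of node $u$; the function name, the keep-ratio normalization, and the toy graph are all illustrative choices, not taken from the paper.

```python
import numpy as np

def degree_based_sparsify(edges, num_nodes, keep_ratio=0.5, seed=0):
    """Sample edges with probability proportional to a degree-based score.

    The score 1/d_u + 1/d_v is used as a cheap stand-in for effective
    resistance (an assumption of this sketch; the paper's sampling and
    reweighting details may differ). Assumes every endpoint appearing in
    `edges` has degree >= 1, so no division by zero occurs.
    """
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)                       # shape (E, 2)
    deg = np.bincount(edges.ravel(), minlength=num_nodes)
    u, v = edges[:, 0], edges[:, 1]
    score = 1.0 / deg[u] + 1.0 / deg[v]             # degree-based proxy
    # Scale keep-probabilities so roughly keep_ratio * E edges survive.
    p = np.minimum(1.0, keep_ratio * len(edges) * score / score.sum())
    mask = rng.random(len(edges)) < p
    return edges[mask], p[mask]                     # kept edges + their probs

# Toy usage: a 6-node graph (a star on node 0 plus a short path).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (3, 4), (4, 5)]
kept, probs = degree_based_sparsify(edges, num_nodes=6, keep_ratio=0.5)
```

Because the scores are computed from degrees alone, this one-time preprocessing pass is linear in the number of edges, which is the source of the "negligible ahead-of-training overhead" claim.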
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yujia_Li1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 865