HSampler: Optimizing Multi-GPU GNN Sampling with Collision-Avoid Selection

Yuyang Jin, Jidong Zhai, Kezhao Huang, Weimin Zheng

Published: 01 Jan 2026, Last Modified: 07 Jan 2026. License: CC BY-SA 4.0
Abstract: Graph Neural Networks (GNNs) have emerged as a powerful approach to graph learning tasks in recent years. As real-world graphs grow larger, sampling-based GNN training is widely used in both academia and industry instead of training on the whole graph. Because the training stage consumes little execution time thanks to the small size of the sampled subgraphs, the data preparation stage, especially graph sampling, becomes the performance bottleneck. Many sampling methods have been proposed to improve efficiency, but they still suffer from significant overhead caused by selection collisions on high-degree nodes in biased sampling. In this paper, we present HSampler, a multi-GPU GNN sampling system. It selects sampled subgraphs using reordering and sliding windows to reduce repeated trials in biased sampling, thus improving the performance of the graph sampling stage. Evaluation on a node with 4 GPUs shows that HSampler significantly outperforms state-of-the-art systems such as DGL and \(P^3\) by 2.1\(\times\) to 6.2\(\times\).
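The collision overhead the abstract refers to can be illustrated with a toy rejection-based weighted sampler (a generic sketch of the problem, not HSampler's actual algorithm; the function name and weight distribution below are purely illustrative). When weights are skewed toward a few high-degree nodes, sampling without replacement repeatedly redraws nodes that were already selected, and each redraw is wasted work:

```python
import random

def biased_sample_without_replacement(nodes, weights, k, seed=0):
    """Naive rejection-based biased sampling without replacement.

    Draws a node with probability proportional to its weight and
    rejects draws that were already selected ("collisions").
    Returns the sampled set and the number of rejected trials.
    """
    rng = random.Random(seed)
    total = float(sum(weights))
    chosen = set()
    collisions = 0
    while len(chosen) < k:
        # Inverse-CDF draw: pick proportionally to weight.
        r = rng.random() * total
        acc = 0.0
        for node, w in zip(nodes, weights):
            acc += w
            if r < acc:
                pick = node
                break
        if pick in chosen:
            # A high-weight node was drawn again: wasted trial.
            collisions += 1
        else:
            chosen.add(pick)
    return chosen, collisions

# One node dominates the weight mass, so it is redrawn repeatedly.
skewed_weights = [100] + [1] * 9
sample, wasted = biased_sample_without_replacement(
    list(range(10)), skewed_weights, k=3)
```

With the skewed distribution above, the dominant node is drawn on roughly nine out of ten trials, so after it enters the sample most subsequent draws collide. Techniques such as reordering and sliding windows, as described in the abstract, aim to avoid paying this repeated-trial cost.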