RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs

ICLR 2025 Conference Submission13337 Authors

28 Sept 2024 (modified: 28 Nov 2024)ICLR 2025 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: row-wise topk selection, GPU, CUDA
Abstract: Top-k selection algorithms are fundamental in a wide range of applications, from high-performance computing and information retrieval to big data processing and neural network model training. In this paper, we present RTop-K, a highly efficient parallel row-wise top-k selection algorithm specifically designed for GPUs. RTop-K leverages a binary search-based approach to optimize row-wise top-k selection, providing a scalable and accelerated solution. We conduct a detailed analysis of early stopping in our algorithm, showing that it effectively maintains the testing accuracy of neural network models while substantially improving performance. Our GPU implementation of RTop-K demonstrates superior performance over state-of-the-art row-wise top-k GPU implementations, achieving speed-ups ranging from 4.25× to 9.51× with early stopping, and 3.94× without early stopping. Moreover, RTop-K is capable of accelerating the overall training workflow of MaxK-GNNs, delivering an average speed-up of 9.76% to 31.53% across different models and datasets.
Primary Area: infrastructure, software libraries, hardware, systems, etc.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13337
Loading