HUNIPU: Efficient Hungarian Algorithm on IPUs

Published: 01 Jan 2024, Last Modified: 10 Feb 2025. Venue: ICDEW 2024. License: CC BY-SA 4.0
Abstract: The Hungarian algorithm is a fundamental approach for a number of matching problems that find corresponding elements in two sets, for instance the optimal alignment of proteins. Due to its computational complexity, this algorithm takes several hours for only a few thousand elements on a desktop computer. The most common way to increase its efficiency is parallelization using GPUs. However, GPUs run the same operation in all the threads of a warp. This constraint limits the Hungarian algorithm, which must find, in multiple steps, the best matching among a variable number of candidates. In this paper, we introduce HUNIPU, a parallel version of the Hungarian algorithm optimized for a novel architecture called the Intelligence Processing Unit (IPU). IPUs are a promising avenue for the Hungarian algorithm since each core can independently perform a distinct operation. We design HUNIPU to exploit the computational model of the IPU and thereby benefit from its computational power. We provide a smooth introduction to the IPU model and its challenges and show its flexibility and potential for numerous algorithmic advances. In our extensive experiments, HUNIPU outperforms the best GPU algorithm on a cutting-edge A100 GPU, running $6\times$ faster on synthetic datasets and up to $32\times$ faster on real datasets for graph alignment.
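To make the matching problem in the abstract concrete, the following is a minimal sequential sketch of the Hungarian algorithm in Python, in its textbook $O(n^3)$ form with row and column potentials. It is not the HUNIPU implementation and contains no IPU or GPU parallelism; the function name `hungarian` and the example cost matrix are assumptions chosen purely for illustration.

```python
from typing import List, Tuple


def hungarian(cost: List[List[float]]) -> Tuple[float, List[int]]:
    """Minimum-cost assignment of each row to a distinct column.

    Illustrative sequential version (hypothetical helper, not from the paper).
    Requires len(cost) <= len(cost[0]).
    Returns (total_cost, assignment) where assignment[i] is the column of row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)        # p[j]: row matched to column j, 0 if free
    way = [0] * (m + 1)      # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i             # virtual column 0 holds the row being inserted
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta, j1 = INF, 0
            # Scan the unused columns for the smallest reduced cost.
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: augmenting path found
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


# Example cost matrix (made up for illustration).
example_cost = [[4, 1, 3],
                [2, 0, 5],
                [3, 2, 2]]
total, assignment = hungarian(example_cost)
print(total, assignment)
```

The inner scan over unused columns examines a different number of candidates at each step, which is the kind of variable-size search the abstract describes as a poor fit for lockstep warp execution.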