Faster Approximation Algorithms for k-Center via Data Reduction

Arnold Filtser, Shaofeng H.-C. Jiang, Yi Li, Anurag Murty Naredla, Ioannis Psarros, Qiaoyuan Yang, Qin Zhang

Published: 01 Jan 2025, Last Modified: 17 May 2025CoRR 2025EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^c$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high dimensional Euclidean spaces.