Online Clustering with Nearly Optimal Consistency

Published: 22 Jan 2025, Last Modified: 11 Feb 2025, ICLR 2025 Poster, CC BY 4.0
Keywords: clustering, online, consistency
TL;DR: (1 + eps)-competitive online clustering algorithms with O(k polylog n) consistency
Abstract: We give online algorithms for $k$-Means (more generally, $(k, z)$-Clustering) with nearly optimal consistency (a notion suggested by Lattanzi & Vassilvitskii (2017)). Our result turns any $\alpha$-approximate offline algorithm for clustering into a $(1+\epsilon)\alpha^2$-competitive online algorithm for clustering with $O(k \,\text{poly}\log n)$ consistency. This consistency bound is optimal up to $\text{poly}\log(n)$ factors. Plugging in the offline algorithm that returns the exact optimal solution, we obtain the first $(1 + \epsilon)$-competitive online algorithm for clustering that achieves consistency linear in $k$. This simultaneously improves several previous results (Lattanzi & Vassilvitskii, 2017; Fichtenberger et al., 2021). We validate the performance of our algorithm on real datasets by plugging in the practically efficient $k$-Means++ algorithm. Our online algorithm makes $k$-Means++ achieve good consistency with little overhead in solution quality.
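To make the terminology concrete, below is a minimal illustrative sketch (not the paper's algorithm) of the general pattern the abstract refers to: an offline clustering routine is treated as a black box over an online stream of points, and "consistency" is tracked as the total number of center changes. The offline black box here is scikit-learn's KMeans with k-means++ initialization, and the cost-growth recompute trigger (recluster once the current solution's cost exceeds a $(1+\epsilon)$ factor of the cost at the last recompute) is an assumed generic heuristic used only for illustration.

```python
# Illustrative sketch only: NOT the paper's algorithm. A generic "lazy
# reclustering" wrapper that treats an offline clustering routine
# (scikit-learn's k-means with k-means++ init) as a black box on an online
# stream, and tracks consistency = total number of center changes.
import numpy as np
from sklearn.cluster import KMeans


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center (k-Means objective)."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def online_clustering(stream, k, eps=0.5, seed=0):
    """Process points one at a time; call the offline black box only when the
    current solution's cost has grown by more than a (1 + eps) factor."""
    points, centers = [], None
    cost_at_last_recluster = None
    consistency = 0  # total number of centers swapped in/out so far

    for x in stream:
        points.append(x)
        data = np.asarray(points)
        if len(points) <= k:
            centers = data.copy()  # trivially optimal while n <= k
            continue
        if cost_at_last_recluster is None or \
           kmeans_cost(data, centers) > (1 + eps) * cost_at_last_recluster:
            km = KMeans(n_clusters=k, init="k-means++", n_init=5,
                        random_state=seed).fit(data)
            new_centers = km.cluster_centers_
            # crude consistency accounting: count centers that changed
            consistency += sum(
                1 for c in new_centers
                if not any(np.allclose(c, old) for old in centers)
            )
            centers = new_centers
            cost_at_last_recluster = kmeans_cost(data, centers)

    return centers, consistency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stream = rng.normal(size=(500, 2))
    centers, consistency = online_clustering(stream, k=5)
    print("final cost:", kmeans_cost(stream, centers), "consistency:", consistency)
```

The sketch only conveys the interface (offline black box in, online solution plus a consistency count out); the paper's contribution is a trigger and analysis that guarantee $(1+\epsilon)\alpha^2$-competitiveness with $O(k \,\text{poly}\log n)$ consistency, which this heuristic does not.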
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7997