Modified K-means Algorithm with Local Optimality Guarantees

Mingyi Li; Michael R. Metel; Akiko Takeda

Modified K-means Algorithm with Local Optimality Guarantees

Mingyi Li, Michael R. Metel, Akiko Takeda

Published: 01 May 2025, Last Modified: 18 Jun 2025ICML 2025 posterEveryoneRevisionsBibTeXCC BY 4.0

TL;DR: We show that the K-means algorithm does not always converge to a locally optimal solution, and propose a modified algorithm with local optimality guarantees, with the same computational complexity as the original K-means algorithm.

Abstract: The K-means algorithm is one of the most widely studied clustering algorithms in machine learning. While extensive research has focused on its ability to achieve a globally optimal solution, there still lacks a rigorous analysis of its local optimality guarantees. In this paper, we first present conditions under which the K-means algorithm converges to a locally optimal solution. Based on this, we propose simple modifications to the K-means algorithm which ensure local optimality in both the continuous and discrete sense, with the same computational complexity as the original K-means algorithm. As the dissimilarity measure, we consider a general Bregman divergence, which is an extension of the squared Euclidean distance often used in the K-means algorithm. Numerical experiments confirm that the K-means algorithm does not always find a locally optimal solution in practice, while our proposed methods provide improved locally optimal solutions with reduced clustering loss. Our code is available at https://github.com/lmingyi/LO-K-means.

Lay Summary: The K-means algorithm is one of the most widely used algorithms for clustering datasets into homogeneous groups. It also seems to be largely accepted, from at least the 1980s, that the K-means algorithm converges to locally optimal solutions. In this work we first show, by counterexample, that this is not true in general. We then develop simple modifications to the K-means algorithm which guarantee that it converges to a locally optimal solution, while also keeping the same computational complexity as the original K-means algorithm. We performed extensive experiments on both synthetic and real-world datasets and confirmed that the K-means algorithm does not always converge to locally optimal solutions in practice, while also verifying that our algorithms generate improved locally optimal solutions with reduced clustering loss.

Link To Code: https://github.com/lmingyi/LO-K-means

Primary Area: General Machine Learning->Clustering

Keywords: K-means algorithm, local optimality, Bregman divergence

Submission Number: 11456

Loading