A Dynamic Convergence Criterion for Fast K-means Computations

Published: 2024 · Last Modified: 08 Jan 2026 · WISA 2024 · CC BY-SA 4.0
Abstract: The K-Means algorithm has effectively promoted the development of intelligent systems and data-driven decision-making through data clustering and analysis. A reasonable convergence criterion directly determines when model training can terminate, which strongly affects model quality. Much research addresses training acceleration and quality improvement, but little focuses on the convergence judgment itself. Current convergence criteria still adopt a centralized judgment strategy based on a single loss value, and the same criterion is simply copied between different optimized K-Means variants, typically the fast Mini-Batch version and the traditional Full-Batch version. Our analysis reveals that such a design cannot guarantee that different variants converge to the same point; it can produce abnormal situations such as false positives and over-training. To enable fair comparison and guarantee model accuracy, we propose a new dynamic convergence criterion, VF (Vote for Freezing), and an optimized version, VF+. VF adopts a distributed judgment strategy in which each sample decides, based on the criterion, whether to keep participating in training or to freeze itself. Combined with sample priorities, VF adaptively adjusts the sample freezing threshold, achieving asymptotic withdrawal of samples and accelerating model convergence. VF+ further introduces parameter freezing thresholds and freezing periods to eliminate redundant distance calculations, improving training efficiency. Experiments on multiple datasets validate the effectiveness of our convergence criterion in terms of training quality and efficiency.
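To make the distributed judgment idea concrete, the sketch below shows a minimal Mini-Batch K-Means loop in which each sample freezes itself once its assigned centroid has nearly stopped moving, and training terminates when all samples have withdrawn. This is an illustrative assumption, not the paper's exact VF definition: the fixed `freeze_threshold`, the centroid-shift rule, and all names here are hypothetical, and the adaptive priority-based threshold adjustment of VF and the parameter thresholds and freezing periods of VF+ are omitted.

```python
import numpy as np

def minibatch_kmeans_freeze(X, k, batch_size=256, n_iters=100,
                            freeze_threshold=1e-3, rng=None):
    """Mini-Batch K-Means with a per-sample freezing criterion (sketch).

    Hypothetical illustration of a distributed convergence judgment:
    each sample tracks the shift of its assigned centroid and freezes
    (withdraws from training) once that shift falls below
    `freeze_threshold`. Training stops when every sample is frozen.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centroids = X[rng.choice(n, k, replace=False)].copy()
    counts = np.zeros(k)                 # per-centroid update counts
    frozen = np.zeros(n, dtype=bool)     # per-sample freeze flags

    for _ in range(n_iters):
        active = np.flatnonzero(~frozen)
        if active.size == 0:             # all samples withdrew: converged
            break
        batch = rng.choice(active, min(batch_size, active.size),
                           replace=False)
        # Assign each batch sample to its nearest centroid.
        d = np.linalg.norm(X[batch, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        old = centroids.copy()
        # Standard mini-batch update with per-centroid learning rates.
        for x, c in zip(X[batch], labels):
            counts[c] += 1
            centroids[c] += (x - centroids[c]) / counts[c]
        # Distributed judgment: a sample freezes itself when its
        # assigned centroid has (nearly) stopped moving.
        shift = np.linalg.norm(centroids - old, axis=1)
        frozen[batch] = shift[labels] < freeze_threshold
    return centroids, frozen
```

Note the contrast with the centralized criterion the abstract critiques: rather than comparing a single global loss value against a tolerance, the stopping decision here is the aggregate of per-sample votes, so samples in already-stable regions stop incurring distance computations while unstable regions keep training.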