Accelerating Clustering and Cluster Quality Evaluation in Large-Scale Problems Through Recursive Updates
Abstract: Clustering algorithms often face scalability bottlenecks due to redundant computations during iterative updates. In this work, we propose a general-purpose optimisation technique based on recursive mean updates, which reduces the computational cost of cluster centroid or medoid updates from linear to constant time. We apply this principle to two commonly used clustering paradigms. First, we introduce R-Means, a fast variant of K-means that recursively updates cluster centroids as data points are reassigned, avoiding repeated full-cluster scans. Second, we present ReSil, an efficient method for computing silhouette scores recursively, significantly accelerating silhouette-based validation and optimisation. Building on these, we propose ReSilC, a silhouette-driven medoid clustering algorithm inspired by PAMSil, which leverages both recursive silhouette and medoid updates to achieve optimal cluster validity at a fraction of the computational cost. Across a suite of real-world and synthetic datasets, we show that our methods consistently match or improve clustering quality while offering substantial speed-ups compared to standard implementations. Our results highlight that recursive update strategies offer a general and effective route to improving clustering performance in both objective-driven and validation-oriented settings.
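To illustrate the recursive mean update the abstract refers to, here is a minimal sketch in plain Python. It is not the authors' R-Means implementation; the function names `add_point` and `remove_point` are hypothetical, and the code assumes Euclidean centroids stored as lists of floats. When a point moves between clusters, each affected centroid is adjusted from its previous value and the cluster size alone, so the cost per update depends on the dimensionality but not on the number of points in the cluster.

```python
def add_point(mean, n, x):
    """Return (new_mean, new_size) after adding point x to a cluster.

    mean: current centroid (list of floats), n: current cluster size.
    Uses mu' = mu + (x - mu) / (n + 1), so no scan of the cluster is needed.
    """
    new_n = n + 1
    new_mean = [m + (xi - m) / new_n for m, xi in zip(mean, x)]
    return new_mean, new_n


def remove_point(mean, n, x):
    """Return (new_mean, new_size) after removing point x from a cluster.

    Uses mu' = mu + (mu - x) / (n - 1), the inverse of the update above.
    """
    if n <= 1:
        raise ValueError("cannot remove the last point of a cluster")
    new_n = n - 1
    new_mean = [m + (m - xi) / new_n for m, xi in zip(mean, x)]
    return new_mean, new_n


# Hypothetical example: move the point [4, 0] from cluster A to cluster B.
mean_a, size_a = [2.0, 0.0], 3    # A = {[0, 0], [2, 0], [4, 0]}
mean_b, size_b = [11.0, 10.0], 2  # B = {[10, 10], [12, 10]}
point = [4.0, 0.0]

mean_a, size_a = remove_point(mean_a, size_a, point)
mean_b, size_b = add_point(mean_b, size_b, point)
```

After the move, the updated centroids agree with the means recomputed from scratch over the new cluster memberships, which is the property a recursive scheme of this kind relies on.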
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Ilan_Shomorony1
Submission Number: 5996