Learning from Sample Stability for Deep Clustering

Zhixin Li; Yuheng Jia; Hui LIU; Junhui Hou

Published: 01 May 2025, Last Modified: 18 Jun 2025
Venue: ICML 2025 poster
License: CC BY 4.0

TL;DR: This paper uses sample stability to improve deep clustering performance.

Abstract: Deep clustering, an unsupervised technique that operates without labels, requires tailored supervision signals for model training. Prior methods explore supervision such as similarity and pseudo labels, yet overlook how individual samples behave during training. Our study relates sample stability during unsupervised training to clustering accuracy and network memorization on a per-sample basis. Samples whose representations are unstable across epochs are often mispredicted, which indicates that they are atypical and difficult for the network to memorize. Leveraging these findings, we introduce, for the first time, supervision signals based on sample stability at the representation level. Our proposed strategy serves as a versatile tool for enhancing various deep clustering techniques. Experiments on benchmark datasets show that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.

Lay Summary: In this paper, we introduce a new approach to grouping data that uses how consistently the representation of each data point behaves during the learning process. After running many tests, we found that the stability of a data point closely relates to how accurately it can be grouped and how well the model remembers it. Based on these observations, we developed a new method that takes advantage of data point stability at both the individual and group levels to improve the overall performance of deep learning-based grouping techniques.
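
To make the idea of per-sample stability across epochs concrete, here is a minimal sketch. It is an illustration only and is not taken from the paper or its repository: it assumes stability can be approximated by the mean cosine similarity between a sample's representations in consecutive epochs, and the function name `sample_stability` and the toy data are hypothetical. The paper's actual definition and training signals may differ.

```python
import numpy as np

def sample_stability(reps):
    """Hypothetical stability score (an assumption, not the paper's definition).

    reps: array of shape (epochs, samples, dim) holding each sample's
    representation at each epoch.
    Returns one score per sample: the mean cosine similarity between that
    sample's representations in consecutive epochs.
    """
    reps = np.asarray(reps, dtype=float)
    norms = np.linalg.norm(reps, axis=-1, keepdims=True)
    unit = reps / np.clip(norms, 1e-12, None)    # L2-normalize each vector
    cos = np.sum(unit[1:] * unit[:-1], axis=-1)  # shape (epochs - 1, samples)
    return cos.mean(axis=0)

# Toy data: sample 0 keeps the same direction in every epoch, while
# sample 1 rotates by 90 degrees between consecutive epochs.
stable = np.tile(np.array([1.0, 0.0]), (5, 1))
unstable = np.array([[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0]], dtype=float)
reps = np.stack([stable, unstable], axis=1)      # shape (5, 2, 2)

scores = sample_stability(reps)
print(scores)
```

Under this assumed score, the sample whose representation never moves scores 1, while the sample that rotates every epoch scores 0. A score of this kind could then be used to rank samples or weight them during training.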

Primary Area: General Machine Learning->Clustering

Keywords: Deep clustering, sample stability

Submission Number: 6351
