Hard Regularization to Prevent Collapse in Online Deep Clustering without Data Augmentation

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
ICLR 2023 Conference Withdrawn Submission
Readers: Everyone
Keywords: deep learning, clustering, online
TL;DR: regularizing hard cluster assignments with a Bayesian optimization problem to prevent collapse in online deep clustering without data augmentation
Abstract: Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster. Successful existing models have employed various techniques to avoid this problem, most of which require data augmentation or which aim to make the average soft assignment across the dataset the same for each cluster. We propose a method that does not require data augmentation, and that, unlike existing methods, regularizes the hard assignments. Using a Bayesian framework, we derive an intuitive optimization objective that can be straightforwardly included in the training of the encoder network. Tested on four image datasets, we show that it consistently avoids collapse more robustly than other methods and that it leads to more accurate clustering. We also conduct further experiments and analysis justifying our choice to regularize the hard cluster assignments.
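Illustrative sketch (not from the submission): the abstract does not spell out the derived Bayesian objective, so the PyTorch snippet below only shows where a regularizer on hard cluster assignments could plug into one online training step. The straight-through argmax and the cluster-usage entropy penalty are assumed stand-ins for the paper's actual term, and the encoder, centroids, loss weights, and hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OnlineClusterer(nn.Module):
    """Toy encoder plus learnable centroids for online deep clustering."""
    def __init__(self, in_dim=784, emb_dim=64, n_clusters=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )
        self.centroids = nn.Parameter(torch.randn(n_clusters, emb_dim))

    def forward(self, x):
        z = F.normalize(self.encoder(x), dim=1)   # embeddings on the unit sphere
        c = F.normalize(self.centroids, dim=1)    # cluster centres
        return z @ c.t() / 0.1                    # temperature-scaled cosine similarities

def hard_assignment_regularizer(logits, eps=1e-8):
    """Placeholder penalty on the batch histogram of *hard* assignments.

    A straight-through estimator lets gradients flow through the argmax, so the
    penalty can back-propagate into the encoder. The penalty itself is the
    negative entropy of cluster usage, which is largest when every point falls
    into one cluster; this exact form is illustrative, not the paper's objective.
    """
    q = F.softmax(logits, dim=1)                          # soft assignments
    hard = F.one_hot(q.argmax(dim=1), q.size(1)).float()  # hard assignments
    st = hard + q - q.detach()                            # straight-through argmax
    usage = st.mean(dim=0)                                # per-cluster usage in the batch
    return (usage * (usage + eps).log()).sum()            # negative usage entropy

# One online training step on a mini-batch (hypothetical data and weights).
model = OnlineClusterer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 784)

logits = model(x)
# Sharpen per-sample assignments (minimize per-sample entropy)...
sharpness = -(F.softmax(logits, 1) * F.log_softmax(logits, 1)).sum(1).mean()
# ...while the regularizer discourages all points collapsing into one cluster.
loss = sharpness + 1.0 * hard_assignment_regularizer(logits)
opt.zero_grad()
loss.backward()
opt.step()
```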
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning