Keywords: deep learning, clustering, online
TL;DR: Regularizing hard cluster assignments with a Bayesian optimization objective to prevent collapse in online deep clustering, without data augmentation.
Abstract: Online deep clustering refers to the joint use of a feature extraction network and
a clustering model to assign cluster labels to each new data point or batch as it
is processed. While faster and more versatile than offline methods, online
clustering can easily reach the collapsed solution, in which the encoder maps all
inputs to the same point and every input is placed into a single cluster. Successful
existing models employ various techniques to avoid this problem, most of which
either require data augmentation or aim to make the average soft assignment
across the dataset the same for each cluster. We propose a method that requires no
data augmentation and that, unlike existing methods, regularizes the hard
assignments. Using a Bayesian framework, we derive an intuitive optimization
objective that can be straightforwardly included in the training of the encoder
network. On four image datasets, we show that our method avoids collapse
more robustly than other methods and leads to more accurate clustering. We
also conduct further experiments and analysis justifying our choice to regularize
the hard cluster assignments.
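To make the collapse phenomenon concrete, here is a minimal, hypothetical sketch (not the paper's actual Bayesian objective): it hard-assigns a batch of features to their nearest centroids and computes a generic penalty based on the entropy of the empirical cluster-usage distribution. The penalty is zero when clusters are used uniformly and maximal (log K) when all points collapse into one cluster. All names (`hard_assign`, `collapse_penalty`) are illustrative assumptions.

```python
import numpy as np

def hard_assign(features, centroids):
    """Assign each feature vector to its nearest centroid (hard assignment)."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def collapse_penalty(assignments, n_clusters):
    """log(K) minus the entropy of the empirical cluster-usage distribution.

    Zero when clusters are used uniformly; equal to log(K) when every
    point in the batch falls into a single cluster (full collapse).
    """
    counts = np.bincount(assignments, minlength=n_clusters).astype(float)
    p = counts / counts.sum()
    nz = p[p > 0]
    entropy = -(nz * np.log(nz)).sum()
    return np.log(n_clusters) - entropy

# Toy batch: random features, random centroids (illustrative only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 8))
cents = rng.normal(size=(4, 8))

spread = collapse_penalty(hard_assign(feats, cents), 4)
collapsed = collapse_penalty(np.zeros(64, dtype=int), 4)
# `collapsed` equals log(4); `spread` is smaller for non-degenerate assignments.
```

A regularizer of this batch-level flavor illustrates why degenerate hard assignments can be penalized directly, which is the setting the abstract contrasts with soft-assignment balancing.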
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning