Collision Cross-entropy for Soft Class Labels and Entropy-based Clustering

13 May 2024 (modified: 06 Nov 2024) · Submitted to NeurIPS 2024 · CC BY 4.0
Keywords: collision cross-entropy, entropy-based clustering
Abstract: We propose ``collision cross-entropy'' as a robust alternative to Shannon's cross-entropy (CE) loss when class labels are represented by soft categorical distributions $y$. In general, soft labels can naturally represent ambiguous targets in classification. They are particularly relevant for self-labeled clustering methods, where latent pseudo-labels $y$ are jointly estimated with the model parameters and uncertainty is prevalent. In the case of soft labels $y$, Shannon's CE teaches the model predictions $\sigma$ to reproduce the uncertainty in each training example, which inhibits the model's ability to learn and generalize from these examples. As an alternative loss, we propose the negative log of the ``collision probability'', which maximizes the chance of equality between two random variables, the predicted class and the unknown true class, whose distributions are $\sigma$ and $y$. We show that it has the properties of a generalized CE. The proposed collision CE agrees with Shannon's CE for one-hot labels $y$, but training from soft labels differs. For example, unlike Shannon's CE, data points where $y$ is a uniform distribution contribute nothing to the training. Collision CE significantly improves classification supervised by soft uncertain targets. Unlike Shannon's CE, collision CE is symmetric in $y$ and $\sigma$, which is particularly relevant when both distributions are estimated in the context of self-labeled clustering. Focusing on discriminative deep clustering, where self-labeling and entropy-based losses are dominant, we show that the use of collision CE improves the state of the art. We also derive an efficient EM algorithm that significantly speeds up pseudo-label estimation with collision CE.
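As a concrete illustration of the loss described in the abstract, below is a minimal PyTorch sketch (not taken from the paper; the function name `collision_cross_entropy`, the `eps` stabilizer, and the batch averaging are illustrative assumptions). It computes the negative log of the collision probability $\sum_k \sigma_k y_k$ per example.

```python
import torch

def collision_cross_entropy(sigma, y, eps=1e-12):
    """Negative log of the collision probability sum_k sigma_k * y_k.

    sigma: (N, K) predicted class probabilities (rows sum to 1)
    y:     (N, K) soft labels, i.e. categorical target distributions
    """
    collision_prob = (sigma * y).sum(dim=1)          # P(predicted class == true class) per example
    return -torch.log(collision_prob + eps).mean()   # average loss over the batch
```

For a one-hot $y$ this reduces to Shannon's CE, $-\log \sigma_c$ for the labeled class $c$; for a uniform $y$ the collision probability is $1/K$ regardless of $\sigma$, so the loss is constant and such points contribute no gradient, consistent with the property stated above.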
Primary Area: Machine vision
Submission Number: 7623