Keywords: classification, regularization, information bottleneck, latent representations.
TL;DR: We develop a method to induce the collapse of same-class latent representations into single points in deep neural network classifiers, thereby enhancing robustness, generalization and reliability of the network.
Abstract: The information-bottleneck principle suggests that the foundation of learning lies in the ability to create compact representations. In machine learning, this goal can be formulated as a Lagrangian optimization problem, where the mutual information between the input and latent representations must be minimized without compromising the correctness of the model's predictions.
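For reference, the Lagrangian mentioned above is commonly written in the following form. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $X$ is the input, $Z$ the latent representation, $Y$ the label, and $\beta > 0$ the trade-off multiplier.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

The first term compresses the representation, while the second rewards keeping the information needed to predict the label correctly.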
Unfortunately, mutual information is difficult to compute in deterministic deep neural network classifiers, which greatly limits the applicability of this approach in challenging scenarios. In this paper, we tackle the problem from a different perspective that does not require computing the mutual information directly. We develop a method that induces the latent representations of each class to collapse into a single point.
This point collapse not only significantly reduces the entropy of the latent distribution, thereby creating an information bottleneck that correlates with improved generalization, but also makes the network Lipschitz, offering guarantees for enhanced robustness.
Our method is straightforward to implement. We demonstrate that it substantially improves the network's robustness, provides a small yet statistically significant increase in generalization, and enhances the network's ability to detect misclassifications.
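The abstract does not specify the loss used to induce the collapse, so the following is only a minimal sketch of one generic way to measure how far same-class latents are from collapsing: the mean squared distance of each latent vector to its class centroid. The function names `class_centroids` and `collapse_penalty` are hypothetical, and this centroid-based penalty is an assumption chosen for illustration, not the authors' method.

```python
from collections import defaultdict


def class_centroids(latents, labels):
    """Return the mean latent vector of each class (illustrative helper)."""
    sums = {}
    counts = defaultdict(int)
    for z, y in zip(latents, labels):
        if y not in sums:
            sums[y] = [0.0] * len(z)
        # Accumulate the per-dimension sum for class y.
        sums[y] = [s + v for s, v in zip(sums[y], z)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def collapse_penalty(latents, labels):
    """Mean squared distance of each latent to its own class centroid.

    The value is 0.0 exactly when every class has collapsed to one point.
    This is an assumed, generic penalty, not the loss from the paper.
    """
    centroids = class_centroids(latents, labels)
    total = 0.0
    for z, y in zip(latents, labels):
        c = centroids[y]
        total += sum((v - m) ** 2 for v, m in zip(z, c))
    return total / len(latents)


if __name__ == "__main__":
    # Class 0 is spread out; class 1 has already collapsed to a single point.
    latents = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
    labels = [0, 0, 1, 1]
    print(collapse_penalty(latents, labels))
```

In a training setting, a term of this kind would typically be added to the classification loss with a weighting coefficient, so that it acts as a regularizer on the latent layer; the choice of layer and weight is left open here.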
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7528