Latent Point Collapse Induces an Information Bottleneck in Deep Neural Network Classifiers

26 Sept 2024 (modified: 05 Feb 2025) | Submitted to ICLR 2025 | CC BY 4.0
Keywords: classification, regularization, information bottleneck, latent representations.
TL;DR: We develop a method to induce the collapse of same-class latent representations into single points in deep neural network classifiers, thereby enhancing robustness, generalization and reliability of the network.
Abstract: The information-bottleneck principle suggests that the foundation of learning lies in the ability to create compact representations. In machine learning, this goal can be formulated as a Lagrangian optimization problem, where the mutual information between the input and latent representations must be minimized without compromising the correctness of the model's predictions. Unfortunately, mutual information is difficult to compute in deterministic deep neural network classifiers, which greatly limits the application of this approach to challenging scenarios. In this paper, we tackle this problem from a different perspective that does not involve direct computation of the mutual information. We develop a method that induces the collapse of latent representations belonging to the same class into a single point. This point collapse not only significantly reduces the entropy of the latent distribution, thereby creating an information bottleneck that correlates with improved generalization, but also makes the network Lipschitz, offering guarantees for enhanced robustness. Our method is straightforward to implement. We demonstrate that it substantially improves the network's robustness, provides a small yet statistically significant increase in generalization, and enhances the network's ability to detect misclassifications.
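Illustration (not part of the submission): the abstract does not say how the collapse is induced or measured, so the following is only a minimal sketch of one way to quantify how close same-class latent representations are to a single point. The function name `within_class_scatter`, the toy arrays, and the choice of mean squared distance to the class centroid are all assumptions made for this example; they are not the authors' method or metric.

```python
import numpy as np

def within_class_scatter(latents, labels):
    """Mean squared distance of each latent vector to its class centroid.

    Hypothetical diagnostic: a value of 0 means every class has collapsed
    to a single point in latent space.
    """
    latents = np.asarray(latents, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = latents[labels == c]       # latents of class c
        centroid = members.mean(axis=0)      # class centroid
        total += np.sum((members - centroid) ** 2)
    return total / len(latents)

# Toy latent vectors (made up for illustration), two classes of two points each.
labels = np.array([0, 0, 1, 1])
collapsed = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 2.0]])
spread = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])

print(within_class_scatter(collapsed, labels))  # fully collapsed classes
print(within_class_scatter(spread, labels))     # classes with residual spread
```

Under these assumptions, a lower value indicates latents that are closer to the point collapse the abstract describes; a quantity like this could in principle be tracked during training, though whether the paper does so is not stated here.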
Primary Area: optimization
Submission Number: 7528