Keywords: uncertainty estimation, variational information bottleneck
Abstract: While deep neural networks for classification have shown impressive predictive performance, e.g., in image classification, they generally tend to be overconfident. We start from the observation that popular methods which reduce overconfidence by regularizing the distribution of outputs or intermediate variables achieve better calibration at the cost of the separability of correct and incorrect predictions, another important facet of uncertainty estimation. To circumvent this, we propose a novel method that builds upon the distributional alignment of the variational information bottleneck and encourages assigning lower confidence to samples from the latent prior. Our experiments show that, compared to a multitude of output regularization methods, this simultaneously improves prediction accuracy and calibration without impacting uncertainty-based separability, across multiple classification settings, including under distributional shift.
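The following is a minimal sketch (not the authors' code) of the idea described in the abstract: a variational information bottleneck (VIB) classifier whose training objective adds a term that pushes predictions on samples drawn from the latent prior toward the uniform distribution, i.e., toward low confidence. All names and hyperparameters (VIBClassifier, vib_losses, lambda_kl, lambda_prior) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VIBClassifier(nn.Module):
    """Sketch of a VIB classifier: encoder -> stochastic latent z -> classifier head."""
    def __init__(self, in_dim, latent_dim, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z ~ q(z|x) = N(mu, diag(exp(logvar)))
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.classifier(z), mu, logvar

def vib_losses(model, x, y, lambda_kl=1e-3, lambda_prior=1.0):
    """Cross-entropy + VIB KL term + low-confidence term on prior samples (assumed weighting)."""
    logits, mu, logvar = model(x)
    ce = F.cross_entropy(logits, y)

    # KL(q(z|x) || N(0, I)): the distributional alignment of the VIB.
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()

    # Extra term: classify samples drawn from the latent prior and push the
    # resulting predictive distribution toward uniform (low confidence).
    z_prior = torch.randn_like(mu)
    log_probs = F.log_softmax(model.classifier(z_prior), dim=1)
    uniform = torch.full_like(log_probs, 1.0 / log_probs.size(1))
    prior_term = F.kl_div(log_probs, uniform, reduction="batchmean")

    return ce + lambda_kl * kl + lambda_prior * prior_term
```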
One-sentence Summary: A novel uncertainty estimation method that retains the benefits of regularizing outputs and intermediate variables without hurting the entropy-based separability of correct and incorrect predictions.
Supplementary Material: zip