Measuring Calibration in Deep Learning

25 Sept 2019 (modified: 23 Mar 2025), ICLR 2020 Conference Blind Submission
Abstract: Overconfidence and underconfidence in machine learning classifiers are measured by calibration: the degree to which the probabilities predicted for each class match the accuracy of the classifier on those predictions. We propose two new measures of calibration, the Static Calibration Error (SCE) and the Adaptive Calibration Error (ACE). These measures take into account every prediction a model makes, in contrast to the popular Expected Calibration Error, which considers only each example's most likely class.
Keywords: Deep Learning, Multiclass Classification, Classification, Uncertainty Estimation, Calibration
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/measuring-calibration-in-deep-learning/code)
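
The contrast drawn in the abstract can be made concrete by comparing how the two errors are computed: ECE bins only the top-1 confidence of each prediction, while SCE bins the predicted probability of every class and averages over classes. Below is a minimal NumPy sketch under that reading, assuming `probs` is an (N, K) array of softmax outputs and `labels` an (N,) array of integer class labels; the function names and binning details are illustrative, not taken from the paper's code.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """ECE: bin only the top-1 (maximum) probability of each prediction."""
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    accuracies = (predictions == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |accuracy - confidence| weighted by the bin's share of samples
            ece += in_bin.mean() * abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
    return ece

def static_calibration_error(probs, labels, n_bins=15):
    """SCE: bin the predicted probability of every class, then average over classes."""
    n, k = probs.shape
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    sce = 0.0
    for c in range(k):
        conf_c = probs[:, c]
        correct_c = (labels == c).astype(float)
        for lo, hi in zip(bins[:-1], bins[1:]):
            in_bin = (conf_c > lo) & (conf_c <= hi)
            if in_bin.any():
                sce += (in_bin.sum() / n) * abs(correct_c[in_bin].mean() - conf_c[in_bin].mean())
    return sce / k
```

In this sketch, ACE would differ from SCE only in how the bin edges are chosen (adaptively, so each bin holds an equal number of predictions, rather than at fixed, evenly spaced thresholds).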