Better Uncertainty Calibration via Proper Scores for Classification and Beyond

Sebastian Gregor Gruber; Florian Buettner

Published: 31 Oct 2022, Last Modified: 06 Apr 2025, NeurIPS 2022 Accept, Readers: Everyone

Keywords: Calibration, Predictive Uncertainty, Classification, Regression

Abstract: With model trustworthiness being crucial for sensitive real-world applications, practitioners are increasingly focused on improving the uncertainty calibration of deep neural networks. Calibration errors are designed to quantify the reliability of probabilistic predictions, but their estimators are usually biased and inconsistent. In this work, we introduce the framework of proper calibration errors, which relates every calibration error to a proper score and provides a corresponding upper bound with optimal estimation properties. This relationship can be used to reliably quantify the improvement in model calibration. We theoretically and empirically demonstrate the shortcomings of commonly used estimators compared to our approach. Due to the wide applicability of proper scores, this gives a natural extension of recalibration beyond classification.
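As a rough illustration of the two kinds of quantities the abstract contrasts, the sketch below computes a commonly used binned calibration error estimate (top-label ECE with equal-width bins) and a proper score (the Brier score) for a handful of predictions. This is not the authors' estimator or their upper bound; the function names `binned_ece` and `brier_score` and the `n_bins` parameter are assumptions chosen for this example.

```python
def brier_score(probs, labels):
    """Brier score: mean squared distance between each predicted
    probability vector and the one-hot encoding of the true label."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2
                     for k, p_k in enumerate(p))
    return total / len(probs)


def binned_ece(probs, labels, n_bins=10):
    """Top-label expected calibration error with equal-width bins.

    Each prediction is placed in a bin by its top confidence; the
    estimate is the weighted mean gap between average confidence and
    accuracy per bin. The result depends on the choice of n_bins.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        # Confidence 1.0 would index past the last bin, so clamp it.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    n = len(probs)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(a for _, a in b) / len(b)
            ece += len(b) / n * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Hypothetical toy predictions for a two-class problem.
    probs = [[0.9, 0.1], [0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    labels = [0, 1, 1, 0]
    print("Brier score:", brier_score(probs, labels))
    print("Binned ECE :", binned_ece(probs, labels, n_bins=10))
```

Rerunning `binned_ece` with different `n_bins` values on the same predictions generally gives different numbers, which is one practical symptom of the estimation issues the abstract refers to; the Brier score has no such tuning parameter.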

Supplementary Material: pdf

TL;DR: This work compares different calibration errors theoretically and empirically, and offers a novel class of calibration errors based on proper scores.

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:2203.07835/code)
