Towards trustworthy predictions from deep neural networks with fast adversarial calibration

Christian Tomani; Florian Buettner

Towards trustworthy predictions from deep neural networks with fast adversarial calibration

Christian Tomani, Florian Buettner

25 Sept 2019 (modified: 12 Oct 2025)ICLR 2020 Conference Blind SubmissionReaders: Everyone

Abstract: To facilitate a wide-spread acceptance of AI systems guiding decision making in real-world applications, trustworthiness of deployed models is key. That is, it is crucial for predictive models to be uncertainty-aware and yield well-calibrated (and thus trustworthy) predictions for both in-domain samples as well as under domain shift. Recent efforts to account for predictive uncertainty include post-processing steps for trained neural networks, Bayesian neural networks as well as alternative non-Bayesian approaches such as ensemble approaches and evidential deep learning. Here, we propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions for a wide range of perturbations. We comprehensively evaluate previously proposed approaches on different data modalities, a large range of data sets, network architectures and perturbation strategies and observe that our modelling approach substantially outperforms existing state-of-the-art approaches, yielding well-calibrated predictions for both in-domain and out-of domain samples.

Keywords: deep learning, uncertainty, calibration, domain shift, robustness

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/towards-trustworthy-predictions-from-deep/code)

Original Pdf: pdf

10 Replies

Loading