Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks

Published: 26 Sept 2024, Last Modified: 13 Nov 2024 | NeurIPS 2024 Datasets and Benchmarks Track Spotlight | CC BY 4.0
Keywords: Uncertainty Quantification, Uncertainty Disentanglement, Aleatoric Uncertainty, Epistemic Uncertainty, Abstained Prediction, Out-of-Distribution Detection
TL;DR: We evaluate recent uncertainty quantifiers on various practical tasks to determine if they can provide disentangled uncertainty estimates.
Abstract: Uncertainty quantification, once a singular task, has evolved into a spectrum of tasks, including abstained prediction, out-of-distribution detection, and aleatoric uncertainty quantification. The latest goal is disentanglement: the construction of multiple estimators that are each tailored to one and only one source of uncertainty. This paper presents the first benchmark of uncertainty disentanglement. We reimplement and evaluate a comprehensive range of uncertainty estimators, ranging from Bayesian and evidential to deterministic ones, across a diverse range of uncertainty tasks on ImageNet. We find that, despite recent theoretical endeavors, no existing approach provides pairs of disentangled uncertainty estimators in practice. We further find that specialized uncertainty tasks are harder than predictive uncertainty tasks, where we observe saturating performance. Our results both provide practical advice on which uncertainty estimators to use for which specific task, and reveal opportunities for future research toward task-centric and disentangled uncertainties. All our reimplementations and Weights & Biases logs are available at https://github.com/bmucsanyi/untangle.
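As background for the disentanglement setting described in the abstract, the sketch below shows the common information-theoretic decomposition of an ensemble's predictive uncertainty into an aleatoric part (expected member entropy) and an epistemic part (mutual information). This is an illustrative baseline only, assuming ensemble-style softmax outputs; the function names are hypothetical and it is not taken from the untangle repository.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of categorical distributions along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def decompose_uncertainty(member_probs):
    """Split an ensemble's predictive uncertainty into aleatoric and epistemic parts.

    member_probs: array of shape (num_members, num_samples, num_classes)
                  holding the softmax outputs of each ensemble member.
    Returns per-sample (total, aleatoric, epistemic), where
      total     = H[mean_m p_m(y|x)]   (predictive entropy)
      aleatoric = mean_m H[p_m(y|x)]   (expected member entropy)
      epistemic = total - aleatoric    (mutual information)
    """
    mean_probs = member_probs.mean(axis=0)            # (num_samples, num_classes)
    total = entropy(mean_probs)                       # predictive entropy
    aleatoric = entropy(member_probs).mean(axis=0)    # expected entropy of members
    epistemic = total - aleatoric                     # BALD-style mutual information
    return total, aleatoric, epistemic


# Usage: 5 ensemble members, 3 inputs, 10 classes of random softmax outputs.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 3, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
total, aleatoric, epistemic = decompose_uncertainty(probs)
```

Disentanglement, in the paper's sense, asks whether such paired estimators each track only their intended source of uncertainty in practice.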
Submission Number: 125