Uncertainty-Aware Gradient Descent via Online Bootstrapping

ICLR 2026 Conference Submission 14081 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Optimization, Bootstrapping, Model Uncertainty, Gradient Descent, Regularization, Deep Learning, Calibration
TL;DR: We make bootstrapping practical for deep learning by training two "twin" models whose divergence provides a live uncertainty signal that regularizes the training process to find more robust solutions.
Abstract: Standard gradient descent methods yield point estimates with no measure of confidence. This limitation is acute in overparameterized and low-data regimes, where models have many parameters relative to available data and can easily overfit. Bootstrapping is a classical statistical framework for uncertainty estimation based on resampling, but naively applying it to deep learning is impractical: it requires training many replicas, produces post-hoc estimates that cannot guide learning, and implicitly assumes comparable optima across runs—an assumption that fails in non-convex landscapes. We introduce Twin-Bootstrap Gradient Descent, a resampling-based training procedure that integrates uncertainty estimation into optimization. Two identical models are trained in parallel on independent bootstrap samples, and a periodic mean-reset keeps both trajectories in the same basin so that their divergence reflects local (within-basin) uncertainty. During training, we use this estimate to sample weights in an adaptive, data-driven way, providing regularization that favors flatter solutions. In deep neural networks and complex high-dimensional inverse problems, the approach improves calibration and generalization and yields interpretable uncertainty maps.
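The core loop described in the abstract — two twins trained on independent bootstrap resamples, with a periodic mean-reset whose accumulated divergence serves as a within-basin uncertainty estimate — can be illustrated with a deliberately minimal sketch on a linear toy problem. This is not the authors' implementation: the learning rate, reset schedule, and variable names are illustrative assumptions, and the sketch omits the paper's adaptive weight-sampling regularization step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = X @ w_true + noise.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad(w, Xb, yb):
    """Gradient of mean squared error for a linear model."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

# Two identical "twin" models, each trained on its own bootstrap resamples.
w1 = 0.01 * rng.normal(size=d)
w2 = w1.copy()
lr, steps, reset_every = 0.05, 500, 50
uncertainty = np.zeros(d)

for t in range(steps):
    # Independent bootstrap samples (drawn with replacement) for each twin.
    i1 = rng.integers(0, n, size=n)
    i2 = rng.integers(0, n, size=n)
    w1 -= lr * grad(w1, X[i1], y[i1])
    w2 -= lr * grad(w2, X[i2], y[i2])

    # Periodic mean-reset: snapshot the twins' divergence as a
    # within-basin uncertainty estimate, then pull both back to
    # their average so the trajectories stay in the same basin.
    if (t + 1) % reset_every == 0:
        uncertainty = np.abs(w1 - w2)
        mean = 0.5 * (w1 + w2)
        w1, w2 = mean.copy(), mean.copy()

# The averaged twins give the point estimate; the last pre-reset
# divergence gives a per-parameter uncertainty signal.
estimate = 0.5 * (w1 + w2)
print("max |estimate - w_true|:", np.max(np.abs(estimate - w_true)))
print("per-parameter uncertainty:", uncertainty)
```

On this convex toy problem the twins converge to the same basin by construction; the mean-reset matters precisely in the non-convex deep-learning setting the paper targets, where unconstrained replicas would drift into different optima and their divergence would no longer measure local uncertainty.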
Primary Area: optimization
Submission Number: 14081