Fine-Tuning with Uncertainty-Aware Priors Makes Vision and Language Foundation Models More Reliable
TL;DR: We fine-tune foundation models with uncertainty-aware priors and show that doing so leads to significantly improved uncertainty quantification on downstream tasks compared to fine-tuning via expected risk minimization.
Abstract: Fine-tuning off-the-shelf pre-trained neural networks has become the default starting point for a wide range of challenging prediction tasks, especially in computer vision and natural language processing, where models pre-trained on millions or even billions of data points are publicly available and can be fine-tuned with a moderate compute budget. However, while fine-tuned models have been shown to significantly improve predictive performance compared to models trained from scratch, they can exhibit poor calibration and fail to reliably identify challenging distribution shifts. In this paper, we improve uncertainty quantification in fine-tuned models by constructing a data-driven, uncertainty-aware fine-tuning prior that assigns high probability density to parameters that induce predictive functions with high uncertainty on input points that are meaningfully different from the data. We derive a tractable variational objective for approximate inference in models with such data-driven uncertainty-aware priors and evaluate models fine-tuned with these priors on a range of transfer learning tasks. We show that fine-tuning with uncertainty-aware priors significantly improves calibration, selective prediction, and semantic shift detection on computer vision and natural language classification tasks.
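As a rough, assumption-heavy sketch of the kind of objective the abstract describes (not the paper's exact prior construction or variational bound), the snippet below fine-tunes a mean-field Gaussian classification head on frozen backbone features by combining an expected negative log-likelihood on the fine-tuning data, a term that rewards high predictive entropy on stand-in "far-away" inputs, and a Gaussian KL regularizer. All names, hyperparameters, and the noise-based proxy for shifted inputs are illustrative assumptions.

```python
# Minimal sketch (not the authors' exact objective): variational fine-tuning of a
# classification head where the uncertainty-aware prior is approximated by a term
# encouraging high predictive entropy on inputs assumed to be far from the
# fine-tuning data. All names and hyperparameters below are assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

feat_dim, num_classes = 512, 10
# Mean-field Gaussian q(theta) over the linear head's weights.
w_mu = torch.zeros(num_classes, feat_dim, requires_grad=True)
w_logvar = torch.full((num_classes, feat_dim), -5.0, requires_grad=True)
opt = torch.optim.Adam([w_mu, w_logvar], lr=1e-3)

def head_logits(feats):
    # Reparameterized sample of the head weights, then a linear layer.
    w = w_mu + torch.randn_like(w_mu) * (0.5 * w_logvar).exp()
    return feats @ w.t()

# Stand-ins for frozen-backbone features of in-distribution and "far-away" inputs.
feats_id = torch.randn(32, feat_dim)
labels_id = torch.randint(0, num_classes, (32,))
feats_far = torch.randn(32, feat_dim) * 5.0  # hypothetical proxy for shifted inputs

for step in range(100):
    opt.zero_grad()
    # (1) Expected negative log-likelihood on the fine-tuning data.
    nll = F.cross_entropy(head_logits(feats_id), labels_id)
    # (2) Uncertainty-aware term: reward high predictive entropy on far-away inputs,
    #     standing in for a prior that favors uncertain predictions off-distribution.
    probs_far = F.softmax(head_logits(feats_far), dim=-1)
    entropy_far = -(probs_far * probs_far.clamp_min(1e-8).log()).sum(-1).mean()
    # (3) KL of q(theta) to a standard Gaussian, keeping the posterior well behaved.
    kl = 0.5 * (w_logvar.exp() + w_mu**2 - 1.0 - w_logvar).sum()
    loss = nll - 0.1 * entropy_far + 1e-4 * kl
    loss.backward()
    opt.step()
```

In this sketch, only the head is treated variationally and the backbone is frozen; the entropy term plays the role of the data-driven prior's preference for parameters that are uncertain away from the data, with the weights 0.1 and 1e-4 chosen arbitrarily for illustration.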
Submission Number: 50