Exploring Covariate and Concept Shift for Out-of-Distribution Detection

Oct 09, 2021 (edited Oct 29, 2021) | NeurIPS 2021 Workshop DistShift Poster | Readers: Everyone
  • Keywords: out-of-distribution detection, distribution shift
  • TL;DR: We propose to characterize out-of-distribution detection using distribution shift.
  • Abstract: Modeling what a neural network does not know (i.e., its uncertainty) is fundamentally important in both theory and practice. This is especially true when the model encounters distribution shift during inference. Bayesian inference has been regarded as the most principled method of uncertainty modeling because it explicitly models two types of uncertainty, epistemic uncertainty and aleatoric uncertainty, in the form of posteriors over parameters and the data likelihood, respectively. Epistemic uncertainty captures the uncertainty of model parameters due to lack of data, while aleatoric uncertainty captures inherent data ambiguity. Practically, epistemic uncertainty is often assessed by a model's out-of-distribution (OOD) detection performance or calibration, while aleatoric uncertainty can be assessed by in-distribution error detection. Recent attempts to model uncertainty using deterministic models have failed to disentangle these two uncertainties due to their non-Bayesian nature. However, it is still possible to capture them empirically in a deterministic model using a combination of density estimation and softmax entropy. This leaves us with the question: how should one approach OOD detection and calibration for deterministic (as opposed to Bayesian) and discriminative (as opposed to generative) models? This is arguably the most widely used class of models due to its speed (compared to Bayesian models) and simplicity (compared to generative models). The conventional association of OOD data with epistemic uncertainty seems to break down for this type of model, specifically because it does not reason about what has changed in the input distribution or about the mechanisms through which these changes affect neural networks. A different perspective is needed to analyze them.
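
As a rough illustration of the "density estimation plus softmax entropy" combination mentioned in the abstract, the sketch below computes both quantities with NumPy. It is not the authors' method: the function names (`softmax_entropy`, `fit_gaussian`, `gaussian_log_density`) are hypothetical, and fitting a single Gaussian to feature vectors is an assumption chosen only for brevity. Low feature density is read as a proxy for epistemic uncertainty, and high softmax entropy as a proxy for aleatoric uncertainty.

```python
import numpy as np


def softmax_entropy(logits):
    """Entropy of the softmax distribution for each row of logits.

    Used here as a proxy for aleatoric uncertainty (an assumption).
    """
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=1)


def fit_gaussian(features):
    """Fit a single Gaussian to in-distribution feature vectors.

    A single Gaussian is an illustrative assumption; any density
    estimator could take its place.
    """
    mean = features.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, cov


def gaussian_log_density(features, mean, cov):
    """Log-density of each feature vector under the fitted Gaussian.

    Low values are read as a proxy for epistemic uncertainty (an assumption).
    """
    d = features.shape[1]
    diff = features - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))
```

In this sketch, an input whose features receive a low log-density would be flagged as a candidate OOD sample, while an in-distribution input with high softmax entropy would be flagged as ambiguous.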