Abstract: Highlights
• We explore evaluation metrics for uncertainty quantification.
• We create toy datasets that highlight different sources of uncertainty.
• Using our toy datasets, we compare and contrast metrics for uncertainty.
• We evaluate AUSE, Spearman correlation, calibration error, and NLL.
• Results: AUSE, NLL, and calibration error are good metrics with different strengths.