Keywords: Calibration, decision making, uncertainty quantification, indistinguishability
Abstract: When facing uncertainty, decision-makers want predictions they can trust. A machine learning provider can convey confidence to decision-makers by guaranteeing that its predictions are distribution calibrated: among the inputs that receive a predicted vector of class probabilities q, the actual distribution over classes is q. For multi-class prediction problems, however, directly optimizing predictions under distribution calibration tends to be infeasible, requiring sample complexity that grows exponentially in the number of classes C. In this work, we introduce a new notion, decision calibration, that requires the predicted distribution and the true distribution over classes to be "indistinguishable" to downstream decision-makers. This perspective gives a new characterization of distribution calibration: a predictor is distribution calibrated if and only if it is decision calibrated with respect to all decision-makers. Our main result shows that, under a mild restriction and unlike distribution calibration, decision calibration is feasible. We design a recalibration algorithm that provably achieves decision calibration efficiently, provided that the decision-makers have a bounded number of actions (e.g., polynomial in C). We validate our recalibration algorithm empirically: compared to existing methods, decision calibration improves decision-making on skin lesion and ImageNet classification with modern neural network predictors.
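One way to make the "indistinguishable to decision-makers" idea concrete is to compare the loss a decision-maker expects under the predicted probabilities with the loss they actually incur. The sketch below is a simplified illustration and not the paper's formal definition or its recalibration algorithm. The function name `decision_calibration_gap`, the loss-matrix layout `loss[a, y]`, and the assumption that the decision-maker picks the action with the lowest expected loss under the prediction are all hypothetical choices made for this example.

```python
import numpy as np

def decision_calibration_gap(probs, labels, loss):
    """Gap between the loss a decision-maker expects and the loss they incur.

    probs:  (n, C) predicted class probabilities
    labels: (n,)   true class indices
    loss:   (A, C) hypothetical loss matrix; loss[a, y] is the cost of
            taking action a when the true class is y
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    loss = np.asarray(loss, dtype=float)

    # Expected loss of every action under each predicted distribution: (n, A).
    expected = probs @ loss.T
    # Assumed decision rule: pick the action that looks best under the prediction.
    actions = expected.argmin(axis=1)

    rows = np.arange(len(labels))
    predicted_loss = expected[rows, actions].mean()  # what the decision-maker anticipates
    realized_loss = loss[actions, labels].mean()     # what actually happens
    return abs(predicted_loss - realized_loss)

# Two classes, two actions, 0-1 loss (action a = "bet on class a").
zero_one = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
labels = np.array([0, 1, 0, 1])

honest = np.full((4, 2), 0.5)                 # predicts 50/50, matches the label frequencies
overconfident = np.tile([0.9, 0.1], (4, 1))   # claims 90% for class 0

gap_honest = decision_calibration_gap(honest, labels, zero_one)
gap_overconfident = decision_calibration_gap(overconfident, labels, zero_one)
```

In this toy data the 50/50 predictor leaves the decision-maker with no surprise, while the overconfident predictor promises a much lower loss than is realized. A small gap for one loss matrix says nothing about other decision-makers; the abstract's notion quantifies over a whole class of them, restricted to a bounded number of actions.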
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.