Keywords: accuracy estimation, error bounds, distribution shift, unsupervised domain adaptation
TL;DR: We give a bound for estimating accuracy under distribution shift using unlabeled test data. It relies on a simple condition that holds in practice effectively 100% of the time. This closes the gap between vacuous but verifiable bounds and tighter estimates that often overestimate accuracy.
Abstract: We derive an (almost) guaranteed upper bound on the error of deep neural networks under distribution shift using unlabeled test data. Prior methods either give bounds that are vacuous in practice or give \emph{estimates} that are accurate on average but heavily underestimate error for a sizeable fraction of shifts. Our bound requires a simple, intuitive condition which is well justified by prior empirical work and holds in practice effectively 100\% of the time. The bound is inspired by $\mathcal{H}\Delta\mathcal{H}$-divergence but is easier to evaluate and substantially tighter, consistently providing non-vacuous guarantees. Estimating the bound requires optimizing one multiclass classifier to disagree with another, for which some prior works have used sub-optimal proxy losses; we devise a "disagreement loss" which is theoretically justified and performs better in practice. Across a wide range of benchmarks, our method gives valid error bounds while achieving average accuracy comparable to competitive estimation baselines.
Submission Number: 15