Selective Classification Under Distribution Shifts

Published: 23 Oct 2024, Last Modified: 23 Oct 2024. Accepted by TMLR. License: CC BY 4.0.
Abstract: In selective classification (SC), a classifier abstains from making predictions that are likely to be wrong in order to avoid excessive errors. To deploy imperfect classifiers---imperfect due to intrinsic statistical noise in the data, robustness issues of the classifier, or other causes---in high-stakes scenarios, SC appears to be an attractive and necessary path to follow. Despite decades of research on SC, most previous SC methods still focus only on the ideal statistical setting, i.e., where the data distribution at deployment is the same as that at training, although practical data can come from the wild. To bridge this gap, in this paper we propose an SC framework that accounts for distribution shifts, termed generalized selective classification, which covers label-shifted (or out-of-distribution) and covariate-shifted samples in addition to typical in-distribution samples---the first of its kind in the SC literature. We focus on non-training-based confidence-score functions for generalized SC on deep learning (DL) classifiers and propose two novel margin-based score functions. Through extensive analysis and experiments, we show that our proposed score functions are more effective and reliable than existing ones for generalized SC across a variety of classification tasks and DL classifiers.
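The core mechanism the abstract describes---score each prediction with a confidence function and abstain below a threshold---can be sketched as follows. This is a minimal illustration using the simple top-two-logit margin as the score; the paper proposes its own margin-based score functions, which may differ from this one.

```python
import numpy as np

def margin_score(logits):
    """Confidence score: gap between the two largest logits.
    (Illustrative margin score; not necessarily the paper's exact functions.)"""
    top2 = np.sort(logits, axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]

def selective_predict(logits, threshold):
    """Predict the argmax class, or abstain (return -1) when the
    confidence score falls below the threshold."""
    scores = margin_score(logits)
    preds = np.argmax(logits, axis=-1)
    return np.where(scores >= threshold, preds, -1)

# Two confident samples and one ambiguous sample.
logits = np.array([
    [4.0, 0.5, 0.1],   # large margin -> predict class 0
    [0.2, 3.0, 0.1],   # large margin -> predict class 1
    [1.1, 1.0, 0.9],   # tiny margin  -> abstain
])
print(selective_predict(logits, threshold=1.0))  # [ 0  1 -1]
```

Raising the threshold trades coverage (fewer predictions) for selective accuracy; generalized SC asks that this trade-off remain reliable even when some inputs are label- or covariate-shifted.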
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- In Sec. 2.3, we added the paragraph "Other related concepts" to clarify the difference between generalized SC and other related concepts.
- In Sec. 2.3, paragraph "Prior work on SC with distribution shifts", we added discussion of some existing literature that was undiscussed in our previously submitted manuscript, including the [NeurIPS workshop paper](https://openreview.net/forum?id=FiqXqKR26c) suggested by Reviewer HfXr.

The above two changes highlight the distinction between our work and existing literature more explicitly and are intended to reiterate the novelty of this paper.

- For Fig. 3, we included the full sets of figures for all three cases and applied color coding for better readability.
- In Appendix B, we added the derivation steps from Eq. 39 to Eq. 40 for clarity.
- In Appendix C, we added a new paragraph at the beginning to better illustrate the purpose of this section.

The above three changes are intended to improve readability.

- Updated (10/22/2024) format for TMLR, including date and URL.
Code: https://github.com/sun-umn/sc_with_distshift
Assigned Action Editor: ~Yonatan_Bisk1
Submission Number: 2648