The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

15 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Adversarial Robustness; Prediction Uncertainty
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We investigate the pitfalls and promise of conformal inference under standard adversarial attacks and propose uncertainty-reducing adversarial training to learn an adversarially robust model with reduced prediction set sizes.
Abstract: In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness against potential adversarial attacks and reliable uncertainty quantification in decision-making. While extensive research has focused on enhancing adversarial robustness through various forms of improved adversarial training (AT), a notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models. To address this gap, this study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) under standard adversarial attacks from the adversarial defense community. We first show that existing conformal prediction methods fail under the commonly used $l_{\infty}$-norm bounded attack if the model is not adversarially trained, which underscores the importance of adversarial training for CP. We next demonstrate that the prediction set size of CP with models trained by AT variants is often larger than with standard AT, which motivates us to investigate CP-efficient AT for reduced prediction set size. Our empirical study finds that two factors are strongly correlated with the efficiency of CP: 1) \emph{predictive entropy} and 2) \emph{the distribution of the true class probability ranking (TCPR)}. Based on these two observations, we propose Uncertainty-Reducing AT (AT-UR) to learn an adversarially robust and CP-efficient model via \emph{entropy minimization} and \emph{Beta importance weighting}. Theoretically, we present a generalization error analysis of Beta importance weighting, indicating that AT-UR can potentially learn a model with improved generalization.
Empirically, we demonstrate the substantially improved CP-efficiency of our method on four image classification datasets compared with several popular AT baselines.
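For readers unfamiliar with the "prediction set size" (CP-efficiency) metric the abstract evaluates, the following is a minimal sketch of split conformal prediction using the standard least-ambiguous (LAC) nonconformity score, 1 minus the softmax probability of the true class. This is a generic textbook construction for illustration, not the authors' exact procedure; the function names and the choice of score are assumptions.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold for 1 - alpha marginal coverage.

    cal_probs: (n, K) softmax outputs on a held-out calibration set.
    cal_labels: (n,) true class indices.
    """
    n = len(cal_labels)
    # LAC nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, q_level, method="higher")

def prediction_set(probs, qhat):
    """Return the indices of all classes whose score falls below the threshold."""
    return np.where(1.0 - probs <= qhat)[0]
```

A model with lower predictive entropy concentrates probability mass on fewer classes, so fewer classes clear the calibrated threshold and the prediction set shrinks; this is the intuition behind the entropy/TCPR observations in the abstract.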
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 163