Keywords: OOD detection, Adversarial detection, Extreme Value Theory
Abstract: This paper aims to transform a trained classifier into an abstaining classifier, such
that the latter is provably protected from out-of-distribution and adversarial samples. The proposed Sample-efficient Probabilistic Detection using Extreme Value
Theory (SPADE) approach relies on a Generalized Extreme Value (GEV) model
of the training distribution in the latent space of the classifier. Under mild assumptions, this GEV model allows for formally characterizing out-of-distribution
and adversarial samples and rejecting them. Empirical validation of the approach
is conducted on various neural architectures (ResNet, VGG, and Vision Transformer) and considers medium- and large-sized datasets (CIFAR-10, CIFAR-100,
and ImageNet). The results show the stability and frugality of the GEV model and
demonstrate SPADE's efficiency compared to state-of-the-art methods.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11146