Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

Peter Lorenz; Mario Ruben Fernandez; Jens Müller; Ullrich Koethe

Published: 28 Jun 2024, Last Modified: 25 Jul 2024. Venue: NextGenAISafety 2024 Poster. License: CC BY 4.0

Keywords: adversarial examples, OOD, post-hoc detectors

Abstract: Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even their benchmarking has been standardized in OpenOOD. The number of post-hoc detectors is growing fast; they offer a way to protect a pre-trained classifier against natural distribution shifts and claim to be ready for real-world scenarios. However, their efficacy in handling adversarial examples has been neglected in the majority of studies. This paper investigates the adversarial robustness of 16 post-hoc detectors against several evasion attacks and discusses a roadmap towards adversarial defense in OOD detectors.

Submission Number: 22
