Adversarial Attack Detection Under Realistic Constraints

Marine Picot; Nathan Noiry; Pablo Piantanida; Pierre Colombo

22 Sept 2022 (modified: 13 Feb 2023), ICLR 2023 Conference Withdrawn Submission, Readers: Everyone

Keywords: Adversarial attacks, Detection, Vision transformers, Safety AI

TL;DR: We propose a simple, real-time, softmax-based detection method for adversarial samples.

Abstract: While adversarial attacks are a serious threat to the safety of neural networks, existing defense mechanisms remain very limited in their applicability to real-world settings. Any industry-driven attack detector is expected to meet three unavoidable requirements: (R1) operating in a black-box scenario where the user has access only to the predicted probabilities, (R2) performing fast inference, and (R3) not involving any training phase. In this paper, we introduce REFEREE, the first detector that meets all these requirements while improving on state-of-the-art performance. It leverages the concept of information projections (I-projections), which generalizes ideas from out-of-distribution detection and makes it possible to extract relevant information contained in the softmax outputs of a network. Our extensive experiments demonstrate that REFEREE improves upon existing methods while considerably reducing inference time: it requires less than 0.05 seconds per test input, which is up to 400 times faster than previous methods. This makes REFEREE an excellent candidate for adversarial attack detection in real-world applications.
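
For readers unfamiliar with the term, the standard textbook definition of an I-projection over K classes can be written as below. This is background only: the abstract does not state which reference distribution or constraint set REFEREE uses, so Q and \mathcal{E} here are placeholder symbols and not the authors' notation.

```latex
% I-projection of a reference distribution Q onto a set of distributions \mathcal{E}
% (standard definition; Q and \mathcal{E} are placeholders, not taken from the paper)
P^{\star} = \operatorname*{arg\,min}_{P \in \mathcal{E}} D_{\mathrm{KL}}(P \,\|\, Q),
\qquad
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{k=1}^{K} P(k) \log \frac{P(k)}{Q(k)}
```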

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

Supplementary Material: zip
