Improved Detection of Adversarial Attacks via Penetration Distortion Maximization

25 Sep 2019 (modified: 24 Dec 2019) · ICLR 2020 Conference Blind Submission · Readers: Everyone
  • TL;DR: Adversarial detection method based on separating class clusters in the embedding space.
  • Abstract: This paper is concerned with the defense of deep models against adversarial attacks. We develop an adversarial detection method, which is inspired by the certificate defense approach, and captures the idea of separating class clusters in the embedding space so as to increase the margin. The resulting defense is intuitive, effective, scalable and can be integrated into any given neural classification model. Our method demonstrates state-of-the-art detection performance under all threat models.
  • Keywords: Adversarial Examples, Adversarial Attacks, Adversarial Defense, White-Box threat models
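The abstract's core idea, separating class clusters in the embedding space so that adversarial inputs land far from every cluster, can be illustrated with a minimal detection sketch. The centroid-distance score below is an assumption made for illustration, not the paper's actual method (which the abstract only describes at a high level):

```python
import numpy as np

def class_centroids(embeddings, labels):
    """Mean embedding per class, computed from clean training data.
    (Hypothetical helper; the paper does not specify this exact rule.)"""
    classes = np.unique(labels)
    return {c: embeddings[labels == c].mean(axis=0) for c in classes}

def detection_score(x_emb, centroids):
    """Distance from an input's embedding to its nearest class centroid.
    If classes form tight, well-separated clusters (a large margin),
    an adversarial input perturbed across the decision boundary tends
    to land between clusters, giving it a large score."""
    return min(np.linalg.norm(x_emb - mu) for mu in centroids.values())

# Toy data: two tight, well-separated clusters in a 2-D embedding space.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                 rng.normal(5.0, 0.1, (50, 2))])
lab = np.array([0] * 50 + [1] * 50)
cents = class_centroids(emb, lab)

clean_score = detection_score(np.array([0.05, -0.02]), cents)  # near cluster 0
adv_score = detection_score(np.array([2.5, 2.5]), cents)       # between clusters
```

Thresholding such a score separates the two cases: the clean input sits near a centroid (small score) while the between-cluster input does not, and increasing the inter-cluster margin widens that gap.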