Towards Adversarial Purification using Denoising AutoEncoders

Dvij Rajesh Kalaria; Aritra Hazra; Partha Pratim Chakrabarti

Published: 05 Dec 2022 | Last Modified: 05 May 2023 | MLSW2022 | Readers: Everyone

Abstract: With the rapid advancement and increased use of deep learning models in image identification, security has become a major concern for their deployment in safety-critical systems. Deep learning architectures are often susceptible to adversarial attacks, which are typically crafted by applying subtle perturbations to normal images; these perturbations are mostly imperceptible to humans but can seriously confuse state-of-the-art machine learning models. We propose a framework, named APuDAE, that leverages Denoising AutoEncoders (DAEs) to purify these adversarial samples by using the DAEs in an adaptive way, thereby improving the classification accuracy of the target classifier networks. We also show that using DAEs adaptively, instead of directly, improves classification accuracy further and is more robust to adaptive attacks designed to fool them. We demonstrate our results on the MNIST, CIFAR-10, and ImageNet datasets and show that APuDAE provides performance comparable to, and in most cases better than, the baseline methods in purifying adversaries.
