Adversary Aware Optimization for Robust Defense

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Adversarial robustness, Adversarial purification, optimization, Adversarial training
TL;DR: We propose a novel optimization-based adversarial purification technique that accounts for the attack landscape.
Abstract: Deep neural networks remain highly susceptible to adversarial attacks, where small, subtle perturbations to input images can induce misclassification. We propose a novel optimization-based purification framework that directly removes these perturbations by maximizing a Bayesian-inspired objective combining a pretrained diffusion prior with a likelihood term tailored to the adversarial perturbation space. Our method iteratively refines a given input through gradient-based updates of a combined score-based loss that guides the purification process. Unlike existing optimization-based defenses that treat adversarial noise as generic corruption, our approach explicitly integrates the adversarial landscape into the objective. Experiments on CIFAR-10 and CIFAR-100 demonstrate strong robust accuracy against a range of common adversarial attacks. Our work offers a principled test-time defense grounded in probabilistic inference using score-based generative models. Our code can be found at \url{https://github.com/rooshenasgroup/aaopt}.
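To make the abstract's procedure concrete, here is a minimal sketch (not the authors' released code) of an optimization-based purification loop: gradient ascent on a Bayesian-style objective log p(x) + log p(x_adv | x), where the prior score comes from a pretrained diffusion model. The function name `score_model`, the fixed noise level `t`, and the Gaussian surrogate for the likelihood (the paper's likelihood is tailored to the adversarial perturbation space) are all assumptions for illustration.

```python
# Hypothetical sketch of optimization-based purification.
# Assumes score_model(x, t) returns an estimate of grad_x log p_t(x)
# from a pretrained diffusion/score model (assumption, not the paper's API).
import torch

def purify(x_adv, score_model, steps=50, lr=0.1, sigma=8 / 255, t=0.1):
    """Iteratively refine x_adv by ascending log p(x) + log p(x_adv | x)."""
    x = x_adv.clone()
    t_batch = torch.full((x.shape[0],), t, device=x.device)
    for _ in range(steps):
        with torch.no_grad():
            prior_grad = score_model(x, t_batch)  # score: grad_x log p(x)
        # Gaussian surrogate likelihood over the perturbation x_adv - x:
        # grad_x log N(x_adv; x, sigma^2 I) = (x_adv - x) / sigma^2
        lik_grad = (x_adv - x) / sigma**2
        x = x + lr * (prior_grad + lik_grad)  # gradient ascent step
        x = x.clamp(0.0, 1.0)                 # keep a valid image
    return x
```

The purified `x` would then be passed to the downstream classifier at test time; the actual objective, step schedule, and likelihood term are specified in the paper and repository above.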
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 24162