Keywords: adversarial sampling, adversarial purification, diffusion model, denoising diffusion codebook model
TL;DR: Comprehensive analysis of the adversarial robustness of diffusion codebook sampling
Abstract: Diffusion models have demonstrated superior performance in both unrestricted adversarial attacks and adversarial purification. However, adversarial guidance can distort the benign sampling process, steering diffusion models toward adversarial rather than benign data distributions, thereby degrading generation performance. Meanwhile, adversarial purification struggles to isolate adversarial influence from the diffusion process, limiting its ability to recover the true benign sample. Recently, denoising diffusion codebook models have introduced a novel sampling paradigm, replacing random noise with selections from a predefined codebook. This enables an adversarially isolated sampling mechanism for diffusion models. In this paper, we propose novel frameworks for adversarial sampling and purification based on diffusion codebook sampling. Our adversarial sampling constructs adversarial examples by selectively drawing from the Gaussian noise codebook, while our purification leverages implicit guidance to suppress adversarial influence and restore benign samples. Furthermore, we introduce several enhancements to strengthen defense performance. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art approaches in both adversarial sampling and purification, offering a promising direction for advancing the adversarial robustness of diffusion models.
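The core mechanism the abstract describes — replacing fresh random noise with a selection from a predefined codebook of Gaussian noise vectors — can be sketched roughly as follows. This is a minimal illustration only; the codebook size, dimensionality, and scoring function are hypothetical stand-ins, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_codebook(k, dim):
    # Pre-sample K Gaussian noise vectors once; these form the fixed
    # "codebook" that replaces on-the-fly random noise draws.
    return rng.standard_normal((k, dim))

def select_noise(codebook, score_fn):
    # Instead of drawing fresh noise at a denoising step, pick the
    # codebook entry that maximizes a task-dependent score (e.g., an
    # adversarial objective for sampling, or a benign-likelihood proxy
    # for purification, as the abstract suggests).
    scores = np.array([score_fn(z) for z in codebook])
    return codebook[int(np.argmax(scores))]

# Toy usage: prefer the noise entry most aligned with a guidance vector g.
g = rng.standard_normal(16)
cb = make_codebook(64, 16)
z = select_noise(cb, lambda e: float(e @ g))
```

Because the selection is restricted to a fixed, finite set of pre-sampled noises, any adversarial influence acts only through the choice of index rather than by perturbing the noise itself, which is the sense in which such sampling can be "adversarially isolated."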
Supplementary Material: pdf
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 6612