SAIF: Sparse Adversarial and Imperceptible Attack Framework

Tooba Imtiaz; Morgan R Kohler; Jared F Miller; Zifeng Wang; Masih Eskandar; Mario Sznaier; Octavia Camps; Jennifer Dy

Published: 29 Jul 2025, Last Modified: 29 Jul 2025. Accepted by TMLR. License: CC BY 4.0

Abstract: Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. For instance, adding calculated small distortions to images can deceive a well-trained image classification network. In this work, we propose a novel attack technique called \textbf{S}parse \textbf{A}dversarial and \textbf{I}mperceptible Attack \textbf{F}ramework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a few pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and largely outperforms state-of-the-art sparse attack methods on ImageNet and CIFAR-10.
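To make the idea of optimizing for bounded magnitude and sparsity with Frank-Wolfe more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). The assumptions are: the perturbation is factored as `delta = s * p`, where `p` is a magnitude vector with `||p||_inf <= eps` and `s` is a mask in `[0, 1]^n` with `sum(s) <= k`; the step size is the classical `2 / (t + 2)`; and the names `grad_fn`, `lmo_linf`, `lmo_sparse_mask`, and `frank_wolfe_sparse_attack` are hypothetical. No convergence rate is claimed for this toy version.

```python
import numpy as np

def lmo_linf(grad, eps):
    """Linear minimization oracle: argmin <grad, p> over ||p||_inf <= eps."""
    return -eps * np.sign(grad)

def lmo_sparse_mask(grad, k):
    """Linear minimization oracle: argmin <grad, s> over 0 <= s <= 1, sum(s) <= k.
    Picks the k most negative gradient entries and ignores non-negative ones."""
    s = np.zeros_like(grad)
    idx = np.argsort(grad)[:k]
    s[idx] = (grad[idx] < 0).astype(grad.dtype)
    return s

def frank_wolfe_sparse_attack(grad_fn, n, eps, k, steps=50):
    """Toy Frank-Wolfe loop over (s, p); grad_fn(delta) returns d(loss)/d(delta)
    for a loss the attacker wants to minimize (hypothetical interface)."""
    g0 = grad_fn(np.zeros(n))
    p = lmo_linf(g0, eps)          # feasible start on the L-inf ball
    s = np.full(n, k / n)          # feasible start: sum(s) == k
    for t in range(steps):
        g = grad_fn(s * p)
        grad_p, grad_s = g * s, g * p      # chain rule through delta = s * p
        gamma = 2.0 / (t + 2.0)            # assumed step-size schedule
        # Convex combinations keep both iterates inside their constraint sets,
        # so no projection step is needed.
        p = p + gamma * (lmo_linf(grad_p, eps) - p)
        s = s + gamma * (lmo_sparse_mask(grad_s, k) - s)
    return s, p

# Usage on a toy linear loss <c, delta>, whose gradient is the constant c.
c = np.array([0.3, -2.0, -1.0, -0.1])
s, p = frank_wolfe_sparse_attack(lambda delta: c, n=4, eps=0.1, k=2)
delta = s * p
```

Because each update is a convex combination of the current iterate and an oracle output, the magnitude bound and the sparsity budget hold at every step, which is the property that makes Frank-Wolfe attractive for this kind of constrained attack.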

Submission Length: Regular submission (no more than 12 pages of main content)

Code: https://github.com/toobaimt/SAIF

Supplementary Material: zip

Assigned Action Editor: ~Daniel_M_Roy1

Submission Number: 4435
