FreqSAM: Saliency-Masked Frequency–Spatial Adversarial Attacks for Stealthy Examples

TMLR Paper8500 Authors

18 Apr 2026 (modified: 16 May 2026) | Under review for TMLR | CC BY 4.0
Abstract: Deep Neural Networks (DNNs) have achieved remarkable success across computer vision tasks, yet their vulnerability to adversarial perturbations remains a critical security concern. Existing adversarial attacks typically operate in a single representation (spatial or frequency), which can limit control over the effectiveness–imperceptibility trade-off and lead to perceptible artifacts. We introduce FreqSAM (Frequency-enhanced Salient Area Masking), an adversarial attack that combines saliency-guided spatial localization with frequency-aware updates to generate effective adversarial examples with strong perceptual similarity. FreqSAM localizes spatial perturbations within semantically salient regions identified through gradient-based saliency maps, while shaping perturbations using Fast Fourier Transform (FFT) masking. This spatial–frequency design targets a strong effectiveness–imperceptibility trade-off under standard norm constraints. Experiments on ImageNet across multiple architectures show that FreqSAM achieves high white-box success rates while improving visual fidelity as measured by $L_2$, SSIM, and PSNR, and that it exhibits moderate black-box transferability. We further evaluate FreqSAM under several common defense settings, including adversarially trained and augmentation-based models. Our results indicate that common ImageNet models and several robustness baselines remain vulnerable to perturbations constrained jointly in the spatial and frequency domains, motivating defenses and evaluations that consider multi-domain attack vectors.
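The combination of saliency-guided spatial masking and FFT masking described in the abstract can be illustrated with a minimal NumPy sketch of one update step. This is not the authors' implementation: the function names, the top-fraction saliency threshold, the low-pass disc, the sign-gradient step, and the $L_\infty$ projection are all assumptions chosen for illustration, since the abstract does not specify which frequencies are kept or how the saliency map is thresholded.

```python
import numpy as np

def saliency_mask(grad, keep_frac=0.2):
    """Binary (H, W) mask over the most salient pixels (hypothetical helper).

    Saliency is taken as the channel-summed absolute gradient; roughly the
    top `keep_frac` fraction of pixels is kept (more if values tie).
    """
    sal = np.abs(grad).sum(axis=0)                  # (C, H, W) -> (H, W)
    k = max(1, int(round(keep_frac * sal.size)))
    thresh = np.partition(sal.ravel(), -k)[-k]      # k-th largest value
    return (sal >= thresh).astype(grad.dtype)

def lowpass_mask(h, w, radius):
    """Binary disc around the zero frequency, in unshifted FFT layout.

    Keeping low frequencies is an assumption; the abstract only says
    that FFT masking is used.
    """
    fy = np.fft.fftfreq(h) * h                      # integer frequency indices
    fx = np.fft.fftfreq(w) * w
    dist = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    return (dist <= radius).astype(np.float64)

def masked_step(x, x_adv, grad, eps, alpha, keep_frac=0.2, radius=8):
    """One illustrative update: frequency-shape, then spatially mask, then project."""
    _, h, w = grad.shape
    step = alpha * np.sign(grad)
    # Frequency shaping: zero out spectral components outside the disc.
    spec = np.fft.fft2(step, axes=(-2, -1))
    step = np.real(np.fft.ifft2(spec * lowpass_mask(h, w, radius), axes=(-2, -1)))
    # Spatial localization: only salient pixels are allowed to change.
    step = step * saliency_mask(grad, keep_frac)
    # Project onto the L-infinity ball around x and the valid pixel range.
    delta = np.clip(x_adv + step - x, -eps, eps)
    return np.clip(x + delta, 0.0, 1.0)
```

In this sketch `x` is a clean image in `[0, 1]` with shape `(C, H, W)`, `x_adv` is the current adversarial iterate, and `grad` is the loss gradient with respect to `x_adv`, which would come from the attacked model. Applying the spatial mask after the inverse FFT guarantees that non-salient pixels stay untouched, at the cost of reintroducing some high-frequency content at the mask boundary.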
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Venkatesh_Babu_Radhakrishnan2
Submission Number: 8500