Keywords: Randomized Feature Squeezing, Unseen $l_p$ Attacks
Abstract: Deep learning has made tremendous progress in recent decades; however, it remains vulnerable to adversarial attacks. The most effective defense is perhaps adversarial training, but it is impractical because it requires prior knowledge of the attacker's strategy and incurs high computational costs. In this paper, we propose a novel approach that trains a robust network through standard training on clean images alone, without awareness of the attacker's strategy. We add a specially designed network input layer that performs randomized feature squeezing to reduce malicious perturbations. With only 100/50 epochs of standard training on clean CIFAR-10/ImageNet images, the network simultaneously achieves excellent robustness against unseen $l_0$, $l_1$, $l_2$, and $l_\infty$ attacks, measured in terms of the attacker's computational cost versus the defender's. Thorough experimental results validate this performance. Moreover, our approach also defends against unlearnable examples generated by One-Pixel Shortcut, which breaks down the adversarial training approach.
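The abstract does not specify how the randomized input layer is built. As a rough illustration of the general idea of randomized feature squeezing, a minimal sketch follows: classical feature squeezing reduces the bit depth of each input pixel, and randomizing the chosen bit depth per forward pass makes the exact quantization grid unpredictable to an attacker. The function name, parameters, and bit-depth range below are hypothetical, not the paper's actual layer.

```python
import numpy as np

def randomized_feature_squeeze(x, min_bits=3, max_bits=5, rng=None):
    """Quantize an image to a randomly chosen bit depth.

    x: float array with values in [0, 1] (e.g. a CIFAR-10 image).
    A bit depth b is drawn uniformly from [min_bits, max_bits]; the
    image is then rounded onto the 2**b - 1 level grid. Both the
    signature and the default range are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    bits = int(rng.integers(min_bits, max_bits + 1))
    levels = 2 ** bits - 1
    # Round each pixel to the nearest of the `levels + 1` grid points,
    # discarding small (potentially adversarial) perturbations.
    return np.round(x * levels) / levels

# Example: squeeze a random 3x32x32 "image".
x = np.random.default_rng(1).random((3, 32, 32)).astype(np.float32)
sq = randomized_feature_squeeze(x, rng=np.random.default_rng(0))
```

In this sketch the squeezing error per pixel is at most half a quantization step, i.e. $0.5/(2^b - 1)$, so perturbations smaller than that scale are largely absorbed by the rounding.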
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 14939