Spectral Regularization as a Safety-Critical Inductive Bias

Published: 29 Sept 2025, Last Modified: 24 Oct 2025 · NeurIPS 2025 - Reliable ML Workshop · CC BY 4.0
Keywords: Adversarial Robustness, AI Safety, Spectral Bias, Gradient Regularization, Fourier Analysis, Deep Learning
TL;DR: This paper introduces Fourier Gradient Regularization (FGR), a physics-inspired training method that improves a model's robustness to adversarial attacks by penalizing high-frequency components of its input gradients.
Abstract: Deep neural networks exhibit a "spectral bias": a tendency to learn low-frequency functions more readily than high-frequency ones. This bias leaves a critical vulnerability, because adversarial attacks introduce subtle, high-frequency perturbations to inputs that cause catastrophic model failures. This paper introduces Fourier Gradient Regularization (FGR), a novel, physics-inspired training method that directly addresses this vulnerability. By penalizing the high-frequency components of the model's input gradients during training, analogous to a coarse-graining procedure in physics, FGR induces a "smoothness" prior, forcing the model to become less sensitive to the very perturbations adversaries exploit. Our empirical results on CIFAR-10 with a ResNet-18 architecture demonstrate that FGR can more than double adversarial robustness under Projected Gradient Descent (PGD) attacks while maintaining near-identical performance on clean data, a highly favorable accuracy-robustness trade-off.
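The mechanism described in the abstract can be illustrated with a minimal sketch, assuming FGR adds a penalty on the high-frequency spectral energy of the input gradient to the standard task loss. The function names (`fgr_loss`, `high_pass_mask`), the radial frequency cutoff, and the weighting `lam` are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a Fourier Gradient Regularization (FGR) style objective,
# assuming the regularizer penalizes high-frequency energy in the input
# gradient of the task loss. Names, the radial high-pass mask, and `lam`
# are hypothetical choices for illustration only.
import torch
import torch.nn.functional as F

def high_pass_mask(h, w, cutoff=0.25, device="cpu"):
    """Boolean mask selecting spatial frequencies above `cutoff` (in cycles per pixel)."""
    fy = torch.fft.fftfreq(h, device=device)  # frequencies in [-0.5, 0.5)
    fx = torch.fft.fftfreq(w, device=device)
    radius = torch.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    return radius > cutoff

def fgr_loss(model, x, y, lam=1.0, cutoff=0.25):
    """Cross-entropy plus a penalty on the high-frequency input-gradient energy."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    task_loss = F.cross_entropy(logits, y)

    # Input gradient, kept in the graph so the penalty is differentiable.
    grad = torch.autograd.grad(task_loss, x, create_graph=True)[0]  # (B, C, H, W)

    # 2D FFT over spatial dimensions; penalize energy above the cutoff frequency.
    spectrum = torch.fft.fft2(grad, norm="ortho")
    mask = high_pass_mask(x.shape[-2], x.shape[-1], cutoff, x.device)
    hf_energy = (spectrum.abs() ** 2 * mask).sum(dim=(-2, -1)).mean()

    return task_loss + lam * hf_energy
```

As with other input-gradient regularizers, this penalty requires differentiating through the input gradient (double backpropagation), so each training step costs roughly twice as much as standard training.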
Submission Number: 63