Incorruptible Neural Networks: Training Models That Can Generalize to Large Internal Perturbations

ICLR 2026 Conference Submission 16559 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Noise Robustness, Sharpness-Aware Minimization, Weight Perturbations
TL;DR: We explore the use of sharpness-aware minimization (SAM) and random weight perturbation as tools for training neural networks robust to weight-space perturbations.
Abstract: Flat regions of the neural network loss landscape have long been hypothesized to correlate with better generalization properties. A closely related but distinct problem is training models that are robust to internal perturbations of their weights, which may be an important requirement for future low-power hardware platforms. Several methods have been proposed to guide optimization toward improved generalization, such as sharpness-aware minimization (SAM) and random weight perturbation (RWP), which rely on adversarial and random perturbations, respectively. In this paper, we explore how to adapt these approaches to find minima robust to a wide variety of random corruptions to the weights. First, we evaluate SAM and RWP across a broad range of noise settings, and in doing so establish that over-regularization during training is key to finding optimally robust minima. At the same time, we observe that large perturbations lead to a vanishing-gradient effect caused by unevenness in the loss landscape, an effect particularly pronounced in SAM. Quantifying this effect, we map out a general performance trend for SAM and RWP, determining that SAM works best for robustness to small perturbations, whereas RWP works best for large perturbations. Lastly, to overcome the deleterious vanishing-gradient effect during training, we propose a dynamic perturbation schedule that matches the natural evolution of the loss landscape and produces minima more noise-robust than otherwise possible.
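
Note (illustrative): the abstract contrasts SAM's adversarial weight perturbations with RWP's random ones. The PyTorch-style sketch below shows one training step of each in minimal form; the function names, the single-perturbation structure, and the hyperparameters rho (SAM radius) and sigma (RWP noise scale) are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (not the submission's code): one optimization step of
# SAM-style adversarial weight perturbation vs. RWP-style random perturbation.
import torch


def sam_step(model, loss_fn, x, y, optimizer, rho=0.05):
    # First pass: gradient at the current weights.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))

    # Climb to the (approximate) worst-case weights within radius rho.
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(rho * g / (grad_norm + 1e-12))

    # Second pass: gradient at the perturbed weights drives the update.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    with torch.no_grad():  # restore the original weights before stepping
        for p, g in zip(model.parameters(), grads):
            p.sub_(rho * g / (grad_norm + 1e-12))
    optimizer.step()


def rwp_step(model, loss_fn, x, y, optimizer, sigma=0.01):
    # Draw an isotropic Gaussian perturbation and apply it to the weights.
    noise = [sigma * torch.randn_like(p) for p in model.parameters()]
    with torch.no_grad():
        for p, n in zip(model.parameters(), noise):
            p.add_(n)

    # Gradient at the randomly perturbed weights drives the update.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    with torch.no_grad():  # remove the noise before stepping
        for p, n in zip(model.parameters(), noise):
            p.sub_(n)
    optimizer.step()
```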
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 16559