Keywords: Optimization, Deep Learning
Abstract: The Adam optimizer remains the default choice in deep learning, offering reliable performance across diverse architectures and tasks.
In this work, we reinterpret Adam from a signal-processing perspective—viewing its gradient update as a momentum estimate normalized by noise amplitude—and propose a simple modification: replacing the second raw moment with the second central moment (variance).
We show that centering provides a more accurate estimate of noise amplitude, allowing the optimizer to normalize the impact of gradient noise uniformly across the loss landscape and to dynamically scale momentum elements according to their signal-to-noise ratio.
Empirically, this modification yields consistent performance gains over Adam and its variants across multiple learning paradigms and neural network architectures, including reinforcement learning and sequence modeling.
Notably, on reinforcement learning benchmarks such as MuJoCo, where Adam remains the gold standard due to non-stationarity and the absence of reliable learning rate schedules, our centered variant, "Adam+", achieves faster convergence and improved stability than Adam.
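A minimal sketch of the centering idea described in the abstract, for illustration only: the second-moment exponential moving average tracks (g - m)^2 (a variance, i.e. noise-amplitude estimate) instead of g^2. The function name `adam_plus_step`, the hyperparameter defaults, and details such as bias correction and whether the current or previous momentum is used for centering are assumptions, not taken from the paper.

```python
import numpy as np

def adam_plus_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical update step of a centered-Adam variant.

    Only the second-moment line differs from standard Adam: it accumulates the
    squared deviation of the gradient from the momentum estimate (a variance)
    rather than the raw squared gradient.
    """
    t = state["t"] + 1
    m = beta1 * state["m"] + (1 - beta1) * grad             # first moment (momentum)
    v = beta2 * state["v"] + (1 - beta2) * (grad - m) ** 2  # second *central* moment
    m_hat = m / (1 - beta1 ** t)                             # bias-corrected momentum
    v_hat = v / (1 - beta2 ** t)                             # bias-corrected variance
    new_param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return new_param, {"t": t, "m": m, "v": v}

# Usage on a toy quadratic, minimizing 0.5 * ||x||^2 (gradient is x itself):
x = np.array([1.0, -2.0])
state = {"t": 0, "m": np.zeros_like(x), "v": np.zeros_like(x)}
for _ in range(200):
    x, state = adam_plus_step(x, grad=x, state=state)
print(x)  # approaches the origin
```

Dividing the momentum by this centered estimate is what the abstract describes as scaling each element by its signal-to-noise ratio: coordinates whose gradients fluctuate widely around their mean take smaller effective steps.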
Primary Area: optimization
Submission Number: 19708