Atlas – Rethinking Optimizer Design for Stability and Speed

Published: 22 Sept 2025 · Last Modified: 02 Dec 2025 · NeurIPS 2025 Workshop · CC BY 4.0
Keywords: optimization algorithms; second-order methods; curvature approximation; trust-region methods; deep learning; training stability
TL;DR: Atlas is a lightweight, curvature-aware optimizer that combines Hutch++ sketches, trust-radius control, and safe-step rollback to deliver second-order training stability and accuracy at first-order cost.
Abstract: Training modern neural networks still relies overwhelmingly on first-order optimisation, despite decades of evidence that second-order information can accelerate convergence and improve final generalisation. The practical barrier is cost: exact curvature is infeasible for large models, and most quasi-second-order methods consume enough memory or wall-clock time to fall behind Adam, let alone SGD. We introduce Atlas, a curvature-aware optimiser that keeps its overhead small: (i) a Hutch++ low-rank sketch extracts promising curvature directions in O(kd) memory, (ii) a trust-radius clamp prevents runaway steps without tuning, and (iii) a lightweight Safe-Step Control rolls back the rare catastrophic update. On five image-classification benchmarks (MNIST, Fashion-MNIST, SVHN, CIFAR-10, CIFAR-100) with identical micro-CNNs, Atlas achieves the highest test accuracy on all five tasks, beating the strongest baseline by up to 2.54 percentage points and the macro mean by 2.4 pp, while reducing rollback events by an order of magnitude. Atlas therefore delivers second-order quality at first-order cost.
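To make the three ingredients in the abstract concrete, here is a minimal, hedged sketch of one optimiser step on a toy quadratic. It is not the Atlas implementation: the function name `atlas_like_step`, the constants, and the toy problem are illustrative assumptions, and only the low-rank probing part of Hutch++ (random probes plus QR) is shown, not the full trace estimator.

```python
import numpy as np

# Toy strongly convex quadratic standing in for the training loss.
rng = np.random.default_rng(0)
d, k = 50, 5                               # parameter dim, sketch rank (hypothetical values)
M = rng.standard_normal((d, d))
A = M @ M.T / d + np.eye(d)                # SPD "Hessian" of the toy problem
w_star = rng.standard_normal(d)            # optimum of the toy loss
loss = lambda w: 0.5 * (w - w_star) @ A @ (w - w_star)
grad = lambda w: A @ (w - w_star)
hvp  = lambda w, v: A @ v                  # Hessian-vector product (model-specific in practice)

def atlas_like_step(w, trust_radius=1.0, lr=0.1, damping=1e-3, rollback_tol=1.5):
    g = grad(w)
    # (i) Hutch++-style sketch: k random probes, orthonormalised HVPs give a
    #     low-rank curvature basis Q at O(kd) memory.
    S = rng.standard_normal((d, k))
    Y = np.stack([hvp(w, S[:, i]) for i in range(k)], axis=1)
    Q, _ = np.linalg.qr(Y)
    H_k = Q.T @ np.stack([hvp(w, Q[:, i]) for i in range(k)], axis=1)
    # Newton-like step inside the sketched subspace, plain gradient step outside it.
    g_sub = Q.T @ g
    step = Q @ np.linalg.solve(H_k + damping * np.eye(k), g_sub) + lr * (g - Q @ g_sub)
    # (ii) Trust-radius clamp: cap the step norm, no schedule tuning needed.
    n = np.linalg.norm(step)
    if n > trust_radius:
        step *= trust_radius / n
    # (iii) Safe-step rollback: reject the rare update that blows the loss up.
    w_new = w - step
    return w if loss(w_new) > rollback_tol * loss(w) else w_new

w = rng.standard_normal(d)
for _ in range(100):
    w = atlas_like_step(w)
print("final loss:", loss(w))
```

On this toy problem the sketched-subspace step, clamp, and rollback suffice to drive the loss near zero; the paper applies the same control structure to stochastic neural-network training.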
Submission Number: 6