The Limits of Large Learning Rates: A Case Study in Single-Index Models

Published: 22 Sept 2025 · Last Modified: 01 Dec 2025 · NeurIPS 2025 Workshop · CC BY 4.0
Keywords: Large learning rates, Edge of stability, Single-index models, Central flow, Gradient descent dynamics, Feature learning, Optimization in neural networks
Abstract: Gradient descent methods with large learning rates have recently been shown to improve generalization in deep networks by enhancing feature learning and acting as an implicit regularizer. In this work, we present a contrasting case study in structured nonlinear models, focusing on the single-index and multi-index settings. Using the central flow framework, we analyze training dynamics in the Edge of Stability (EoS) regime, where iterates oscillate around sharpness thresholds. Our analysis reveals that in the single-index model, the loss and sharpness gradients are collinear; the central flow therefore projects away the only valid descent direction, leading to stalled optimization. Numerical experiments confirm that SGD with large learning rates halts learning in this setting. These results highlight fundamental limitations of large learning rates in structured models, refining our understanding of EoS dynamics and feature learning.
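A minimal sketch of the collinearity argument, under the assumption (in the spirit of the central flow framework) that the EoS-averaged trajectory follows the loss gradient projected onto the tangent space of the sharpness constraint $S(\theta) = 2/\eta$; the notation here is illustrative, not taken from the paper:

\[
\dot\theta \;=\; -\Big(I - \frac{\nabla S\,\nabla S^{\top}}{\|\nabla S\|^{2}}\Big)\,\nabla L(\theta),
\qquad
\nabla L(\theta) = c\,\nabla S(\theta) \;\Longrightarrow\; \dot\theta = 0 .
\]

When the loss gradient $\nabla L$ and the sharpness gradient $\nabla S$ are collinear, as the abstract claims for the single-index model, the projection annihilates the only descent direction and the flow stalls, matching the reported halt in learning.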
Submission Number: 144