Keywords: Gradient Descent, Edge of Stability, Convergence Direction
Abstract: Gradient descent (GD) is a fundamental optimization method in deep learning, yet its asymptotic directional properties remain poorly understood. In this paper, we prove that if GD converges, its trajectory either aligns with a fixed direction or oscillates along a specific line. Fixed-direction convergence occurs under small learning rates, while oscillatory convergence emerges for large learning rates. This result offers a new lens for understanding long-term GD dynamics. Experimentally, we find that this directional convergence behavior also appears in stochastic gradient descent (SGD) and Adam. Furthermore, we discuss how these theoretical findings on oscillatory convergence may shed light on the sharpness dynamics observed in the Edge of Stability (EoS) regime. Our work provides both theoretical clarity and practical insight into the long-term dynamics of multiple optimization methods.
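As a hedged illustration of the two regimes described above (not the paper's construction or proof), the following toy sketch runs GD on a 2D quadratic; the curvatures lam1, lam2 and the step sizes are hypothetical choices made only to expose the fixed-direction and oscillatory behaviors of the normalized iterates.

```python
# Illustrative sketch, assuming a 2D quadratic f(x) = 0.5 * x^T A x.
# All parameter values are hypothetical and chosen for demonstration only.
import numpy as np

lam1, lam2 = 1.0, 0.1                 # curvatures along the two eigendirections
A = np.diag([lam1, lam2])

def gd_directions(eta, steps=200, x0=np.array([1.0, 1.0])):
    """Run GD x <- x - eta * A x and record the normalized iterate at each step."""
    x = x0.copy()
    dirs = []
    for _ in range(steps):
        x = x - eta * (A @ x)
        dirs.append(x / np.linalg.norm(x))
    return np.array(dirs)

# Small learning rate: the contraction factor |1 - eta*lam2| dominates,
# so the normalized iterates settle on a single fixed direction.
small = gd_directions(eta=0.5)
print("small eta, last 3 normalized iterates:\n", small[-3:])

# Large (but still convergent) learning rate: 1 - eta*lam1 is negative and
# largest in magnitude, so the normalized iterates flip sign each step --
# the trajectory oscillates along one line while still converging.
large = gd_directions(eta=1.95)
print("large eta, last 3 normalized iterates:\n", large[-3:])
```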
Supplementary Material: pdf
Primary Area: optimization
Submission Number: 8584