Keywords: mirror descent, dual dynamics, smoothed sign descent
Abstract: The optimization dynamics of gradient descent for overparameterized problems can be viewed as low-dimensional dual dynamics induced by a mirror map, providing a mirror descent perspective on the implicit regularization phenomenon. However, the dynamics of the adaptive gradient methods widely used in practice remain less well understood, while empirical evidence of performance gaps suggests fundamental differences in their underlying dynamics. In this work, we introduce the dual dynamics of smoothed sign descent with stability constant $\varepsilon$ for regression problems, formulated within the mirror descent framework. Unlike prior approaches, ours applies to algorithms, such as Adam, whose update directions deviate from the true gradients. We propose a mirror map that reveals the equivalent dual dynamics under certain assumptions. By studying the dual dynamics, we characterize the convergent solution as approximately minimizing a Bregman-divergence-style function closely related to the $\ell_{3/2}$ norm. Furthermore, we demonstrate the role of the stability constant $\varepsilon$ in shaping the convergent solution. Our analyses offer new insights into the distinct properties of the smoothed sign descent algorithm, and show the potential of applying the mirror descent framework to study complex dynamics beyond gradient descent.
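For concreteness, below is a minimal sketch of the update the abstract refers to, assuming "smoothed sign descent with stability constant $\varepsilon$" denotes the elementwise rule $w \leftarrow w - \eta\, g / (|g| + \varepsilon)$ applied to a least-squares loss; the function and parameter names are illustrative and not taken from the paper.

```python
import numpy as np

def smoothed_sign_descent(X, y, lr=1e-3, eps=1e-2, steps=20_000):
    """Sketch of smoothed sign descent for least-squares regression.

    Each coordinate moves by lr * g / (|g| + eps): as eps -> 0 this
    approaches sign descent, and for |g| << eps it behaves like
    gradient descent with effective step lr / eps.
    """
    n, d = X.shape
    w = np.zeros(d)  # overparameterized regime when d > n
    for _ in range(steps):
        g = X.T @ (X @ w - y) / n       # gradient of 0.5 * mean squared error
        w -= lr * g / (np.abs(g) + eps)  # elementwise smoothed sign update
    return w

# Usage: an underdetermined system (d > n) with many interpolating
# solutions; which interpolant the dynamics select is the implicit
# bias the abstract characterizes via the dual dynamics.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 100))
y = rng.standard_normal(20)
w = smoothed_sign_descent(X, y)
print(np.linalg.norm(X @ w - y))  # residual should be small at convergence
```

Varying `eps` in this sketch changes which interpolating solution the iterates settle on, illustrating the role the abstract attributes to the stability constant $\varepsilon$.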
LaTeX Source Code: zip
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission162/Authors, auai.org/UAI/2025/Conference/Submission162/Reproducibility_Reviewers
Submission Number: 162