Keywords: normalized optimizers, rank-one geometry, nonsmooth optimization, momentum methods, Lion, Muon, time-dependent potentials
Abstract: Recent normalized optimizers such as Lion and Muon highlight the importance of
geometry in modern optimizer design. We propose a unified framework that extends
the Lion-K perspective to a broader class of normalized update rules by
representing the momentum variable in an orthonormal rank-one system and
defining an $\ell_1$-type coefficient potential, thereby covering SGD, Lion,
and Muon within a single geometric view. We further study a regime with explicit
time-dependent potentials, which is not covered by the static formulation, and
show empirically that the resulting optimizer remains stable at ImageNet scale.
On ViT-Base trained on ImageNet-1k, the proposed variant converges reliably and
outperforms AdamW in our comparison, suggesting a route toward more systematic
normalized-optimizer design.
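As an illustration of the geometric view described in the abstract, here is a minimal NumPy sketch, not the authors' implementation. It assumes the "$\ell_1$-type coefficient potential" can be read as the $\ell_1$ norm of the momentum's coefficients in a chosen orthonormal rank-one basis. Under that assumption, the entrywise basis gives a Lion-style sign direction, and the singular-vector basis gives a Muon-style orthogonalized direction. The function names `sign_direction` and `orthogonal_direction` are hypothetical. The sketch omits learning rates, momentum interpolation, weight decay, and the time-dependent potentials.

```python
import numpy as np


def sign_direction(m: np.ndarray) -> np.ndarray:
    """Lion-style direction: a subgradient of the entrywise l1 norm of m.

    Coefficients are the entries of m in the basis of matrix units e_i e_j^T.
    """
    return np.sign(m)


def orthogonal_direction(m: np.ndarray) -> np.ndarray:
    """Muon-style direction: U V^T from the reduced SVD of m.

    Coefficients are the singular values of m in the basis u_i v_i^T, so the
    l1 norm of the coefficients is the nuclear norm. Practical Muon
    implementations approximate U V^T iteratively; an exact SVD is used here
    only for clarity.
    """
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.standard_normal((4, 3))  # stand-in for a momentum matrix

    # In both cases <direction, m> equals the l1 norm of the coefficients.
    entry_l1 = np.abs(m).sum()
    nuclear = np.linalg.svd(m, compute_uv=False).sum()
    print(np.isclose(np.sum(sign_direction(m) * m), entry_l1))
    print(np.isclose(np.sum(orthogonal_direction(m) * m), nuclear))
```

In both cases the update direction pairs with the momentum to give the $\ell_1$ norm of its coefficients. The only thing that changes between the two rules is the choice of rank-one basis.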
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 164