Towards Simple and Provable Parameter-Free Adaptive Gradient Methods

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Optimization
Abstract: Optimization algorithms such as AdaGrad and Adam have significantly advanced the training of deep models by dynamically adjusting the learning rate during the optimization process. However, these methods still require ad-hoc tuning of the base learning rate, which is costly and inefficient in practice. To address this issue, recent research has focused on developing ``parameter-free'' algorithms that operate effectively without learning rate tuning. Despite these efforts, existing parameter-free variants of AdaGrad and Adam tend to be overly complex or lack formal convergence guarantees. In this paper, we present AdaGrad++ and Adam++, novel and simple parameter-free variants of AdaGrad and Adam with convergence guarantees. We prove that AdaGrad++ achieves convergence rates comparable to AdaGrad in convex optimization without any predefined learning rate assumptions. Similarly, Adam++ matches the convergence rate of Adam without relying on any conditions on the learning rates. Experimental results across various deep learning tasks demonstrate the competitive performance of Adam++.
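For context, the standard Adam recurrence below illustrates the learning-rate adjustment the abstract refers to; the base step size $\eta$ is the hyperparameter that a parameter-free variant such as Adam++ aims to remove from tuning. This is the usual baseline update, not the paper's Adam++, whose precise form is not given on this page.

$$
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2,
$$
$$
\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad x_{t+1} = x_t - \eta \,\frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon},
$$

where $g_t$ is the stochastic gradient at step $t$ and $\epsilon$ is a small constant for numerical stability. A parameter-free method replaces the hand-picked $\eta$ with quantities computed on the fly from the observed gradients.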
Primary Area: optimization
Submission Number: 16128