Adaptive Gradient Methods with Local Guarantees

Zhou Lu; Wenhan Xia; Sanjeev Arora; Elad Hazan

Adaptive Gradient Methods with Local Guarantees

Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan

Published: 16 May 2022, Last Modified: 03 Nov 2024AutoML 2022 (Late-Breaking Workshop)Readers: Everyone

Abstract: Adaptive gradient methods are the method of choice for optimization in machine learning and used to train the largest deep models. In this paper we study the problem of learning a local preconditioner, that can change as the data is changing along the optimization trajectory. We propose an adaptive gradient method that has provable adaptive regret guarantees vs. the best local preconditioner. To derive this guarantee, we prove a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods. We demonstrate the robustness of our method in automatically choosing the optimal learning rate schedule for popular benchmarking tasks in vision and language domains. Without the need to manually tune a learning rate schedule, our method can, in a single run, achieve comparable and stable task accuracy as a fine-tuned optimizer.

Keywords: Online Convex Optimization, Adaptive Regret, Learning Rate Schedule

One-sentence Summary: We derive an online algorithm with optimal strongly adaptive regret, and use it in automatically selecting learning rates in optimization.

Reproducibility Checklist: Yes

Broader Impact Statement: Yes

Paper Availability And License: Yes

Code Of Conduct: Yes

Reviewers: Zhou Lu, zhoul@princeton.edu

Main Paper And Supplementary Material: pdf

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/adaptive-gradient-methods-with-local/code)

2 Replies

Loading