Keywords: hyperparameter tuning, learning rate gradient, automatic learning rate selection
TL;DR: We introduce an automatic method for selecting the initial learning rate by leveraging the gradient of the learning rate itself.
Abstract: Selecting an optimal learning rate (LR) is crucial for training deep neural networks, as it significantly affects both convergence speed and final model performance. Determining this optimal LR typically involves two key challenges: choosing an appropriate initial LR and selecting an LR scheduler to adjust the LR during training. This paper focuses on the former challenge, selecting the initial LR. Traditionally, this task relies on manual tuning or heuristic methods, often involving extensive trial and error or computationally expensive search strategies such as grid search or random search. We propose an algorithm, Automatic Learning Rate Selection (ALRS), that finds the initial LR without manual intervention. ALRS leverages the gradient of the LR itself, a comparatively unexplored quantity in this setting. It is a computationally lightweight pre-training procedure that selects the initial LR through iterative refinement, using the sign of the LR gradient in combination with suitable search algorithms. The resulting procedure converges to the optimal LR in a stable and robust manner across a range of optimizers and network architectures.
We evaluate our technique on standard deep learning benchmarks, including MNIST with a CNN, and CIFAR-10 and CIFAR-100 with ResNet-18, using both SGD and Adam optimizers. Our experiments demonstrate that the automatically selected LRs achieve performance comparable to manually tuned LRs and to state-of-the-art results.
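A minimal sketch of the sign-based idea described in the abstract, assuming a PyTorch model trained with plain SGD: the gradient of the batch loss with respect to the LR after one step theta' = theta - lr * g is -g^T grad_loss(theta'), and only its sign is used to drive a bisection over the initial LR. The function names lr_gradient_sign and select_initial_lr, the search bracket [1e-6, 1.0], the geometric midpoint rule, and the iteration count are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch: sign of d(loss)/d(lr) driving a log-scale bisection.
    # Names, bracket, and iteration budget are assumptions for illustration only.
    import copy

    import torch


    def lr_gradient_sign(model, loss_fn, batch, lr):
        """Estimate d(loss)/d(lr) for one SGD step of size ``lr`` on ``batch``.

        For theta' = theta - lr * g, the derivative of loss(theta') w.r.t. lr
        is -g^T grad_loss(theta'): negative means a larger step would reduce
        the batch loss further, positive means the step overshoots.
        """
        x, y = batch
        trial = copy.deepcopy(model)  # probe on a throwaway copy of the model

        # Gradient g at the current parameters theta.
        trial.zero_grad()
        loss_fn(trial(x), y).backward()
        g = [p.grad.detach().clone() for p in trial.parameters()]

        # One SGD step: theta' = theta - lr * g.
        with torch.no_grad():
            for p, gp in zip(trial.parameters(), g):
                p.sub_(lr * gp)

        # Gradient at theta', then the scalar -g^T grad_loss(theta').
        trial.zero_grad()
        loss_fn(trial(x), y).backward()
        dot = sum((gp * p.grad).sum() for gp, p in zip(g, trial.parameters()))
        return -dot.item()


    def select_initial_lr(model, loss_fn, loader, lr_low=1e-6, lr_high=1.0, iters=20):
        """Bisection on log(lr) using only the sign of the LR gradient."""
        data_iter = iter(loader)
        for _ in range(iters):
            lr_mid = (lr_low * lr_high) ** 0.5  # geometric midpoint of the bracket
            try:
                batch = next(data_iter)
            except StopIteration:
                data_iter = iter(loader)
                batch = next(data_iter)
            if lr_gradient_sign(model, loss_fn, batch, lr_mid) < 0:
                lr_low = lr_mid   # loss still decreasing: try larger steps
            else:
                lr_high = lr_mid  # overshooting: try smaller steps
        return (lr_low * lr_high) ** 0.5

The geometric midpoint is used here because learning rates are conventionally tuned on a multiplicative scale; any other sign-driven search (e.g., doubling then halving) would fit the same interface.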
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9162