Adaptive Exponential Decay Rates for Adam

Weidong Zou; Yuanqing Xia; Weipeng Cao; Bineng Zhong

17 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: Optimization method, deep neural networks, Adam and its variants

Abstract: Adam and its variants, including AdaBound, AdamW, and AdaBelief, are widely used to improve the learning speed and generalization performance of deep neural networks. These methods update the weight vector using the first moment estimate and the second raw moment estimate of the gradient, each computed with a predetermined exponential decay rate (by default, $\beta_1 = 0.9$ and $\beta_2 = 0.999$). However, the default decay rates may not be optimal, and tuning them by trial and error is time-consuming. In this paper, we introduce AdamE, a novel variant of Adam that automatically applies dynamic exponential decay rates to the first moment estimate and the second raw moment estimate of the gradient. We also prove the convergence of AdamE in both convex and non-convex cases. To validate our claims, we perform experiments across various neural network architectures and tasks. Compared with adaptive methods that use the default exponential decay rates, AdamE consistently achieves rapid convergence and high accuracy in language modeling, node classification, and graph clustering tasks.
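
For orientation, the role of the two decay rates can be illustrated with a minimal sketch of the standard Adam update with fixed $\beta_1$ and $\beta_2$. This is the baseline that the abstract describes, not AdamE: the rule AdamE uses to adapt the decay rates is not stated on this page and is not reproduced here. The function names `adam_step` and `minimize_quadratic`, the toy objective, and the learning rates are assumptions made purely for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update on lists of floats (baseline Adam, not AdamE).

    beta1 and beta2 are the fixed exponential decay rates of the first moment
    estimate and the second raw moment estimate; t is the 1-based step count.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Exponential moving averages of the gradient and the squared gradient.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction for the zero initialisation of m and v.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01, beta1=0.9, beta2=0.999):
    """Hypothetical toy problem: minimise f(theta) = sum(theta_i ** 2)."""
    theta = [3.0, -2.0]
    m = [0.0, 0.0]
    v = [0.0, 0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * p for p in theta]  # gradient of the toy objective
        theta, m, v = adam_step(theta, grad, m, v, t,
                                lr=lr, beta1=beta1, beta2=beta2)
    return theta


if __name__ == "__main__":
    print(minimize_quadratic())
```

In this sketch `beta1` and `beta2` stay constant for the whole run, which is the setting the abstract argues may be suboptimal. A method with dynamic decay rates would, under this framing, supply different values of the two arguments at each step.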

Primary Area: optimization

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Supplementary Material: pdf

Submission Number: 1228
