AdamT: A Stochastic Optimization with Trend Correction Scheme

Bingxin Zhou; Xuebin Zheng; Junbin Gao

25 Sept 2019 (modified: 22 Jun 2025) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

Keywords: Optimization, ADAM, Stochastic Gradient Descent, Deep Learning

TL;DR: We present AdamT, a new framework for adapting Adam-type methods that includes trend information when updating the parameters with the adaptive step size and gradients.

Abstract: Adam-type optimizers, a class of adaptive moment estimation methods based on exponential moving averages, have been used successfully in many deep learning applications. These methods are appealing because they handle large-scale sparse datasets well, are computationally efficient, and are insensitive to hyper-parameter settings. In this paper, we present AdamT, a new framework for adapting Adam-type methods. Instead of applying a simple exponentially weighted average, AdamT also incorporates trend information when updating the parameters with the adaptive step size and gradients. The added term is expected to capture non-horizontal moving patterns on the cost surface efficiently, and thus to converge more rapidly. We show empirically the importance of the trend component: AdamT consistently outperforms the conventional Adam method in both convex and non-convex settings.
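To make the idea of a trend term concrete, the following is a minimal sketch and not the authors' implementation. It assumes the trend is tracked with a Holt-style linear trend (double exponential smoothing) on the gradient, which this page does not confirm, and it applies the trend only to the first moment, whereas the abstract describes trend information for both the step size and the gradients. The second moment and its bias correction follow plain Adam. All names (`init_state`, `trend_corrected_step`, `gamma1`, `minimize_quadratic`) and the default hyper-parameters are hypothetical choices for illustration.

```python
import math


def init_state():
    """Optimizer state for one scalar parameter (all names are illustrative)."""
    return {"t": 0, "level": 0.0, "trend": 0.0, "v": 0.0}


def trend_corrected_step(theta, grad, state, lr=0.1, beta1=0.9, gamma1=0.9,
                         beta2=0.999, eps=1e-8):
    """One Adam-style update whose first moment carries a Holt-style trend.

    Assumption: this is a generic double-exponential-smoothing variant,
    not the exact AdamT update rule from the paper.
    """
    state["t"] += 1
    prev_level = state["level"]
    if state["t"] == 1:
        # Seed the level with the first gradient and start with no trend.
        state["level"] = grad
    else:
        # Level: EMA of the gradient, shifted by the previous trend estimate.
        state["level"] = (1 - beta1) * grad + beta1 * (prev_level + state["trend"])
        # Trend: EMA of the change in level between consecutive steps.
        state["trend"] = ((1 - gamma1) * (state["level"] - prev_level)
                          + gamma1 * state["trend"])
    # Second moment and its bias correction follow plain Adam.
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    # Trend-corrected first moment: level plus trend.
    m = state["level"] + state["trend"]
    return theta - lr * m / (math.sqrt(v_hat) + eps)


def minimize_quadratic(steps=500, start=0.0, target=3.0):
    """Minimize f(x) = (x - target)^2, a convex toy problem."""
    x, state = start, init_state()
    for _ in range(steps):
        grad = 2.0 * (x - target)  # derivative of (x - target)^2
        x = trend_corrected_step(x, grad, state)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())
```

Setting `gamma1=1.0` keeps the trend at zero, so the first moment reduces to an ordinary exponential moving average of the gradient (here without Adam's first-moment bias correction), which gives a simple baseline for comparison.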

Code: https://github.com/xuebin-zh/AdamT

Community Implementations: [1 code implementation on CatalyzeX](https://www.catalyzex.com/paper/adamt-a-stochastic-optimization-with-trend/code)
