NAMSG: An Efficient Method for Training Neural Networks

Yushu Chen; Hao Jing; Wenlai Zhao; Zhiqiang Liu; Ouyi Li; Liang Qiao; Haohuan Fu; Wei Xue; Guangwen Yang

25 Sept 2019 (modified: 22 Jun 2025), ICLR 2020 Conference Withdrawn Submission, Readers: Everyone

TL;DR: A new algorithm for training neural networks that compares favorably to popular adaptive methods.

Abstract: We introduce NAMSG, an adaptive first-order algorithm for training neural networks. The method is efficient in both computation and memory, and is straightforward to implement. It computes gradients at configurable remote observation points, which speeds up convergence in the stochastic setting by adjusting the step size along directions with different curvatures. It also scales the update vector elementwise by a nonincreasing preconditioner, retaining the advantages of AMSGRAD. We analyze the convergence properties for both convex and nonconvex problems by modeling the training process as a dynamical system, and provide a strategy for selecting the observation factor without grid search. We give a data-dependent regret bound that guarantees convergence in the convex setting. The method further achieves an O(log(T)) regret bound for strongly convex functions. Experiments demonstrate that NAMSG works well on practical problems and compares favorably to popular adaptive methods such as ADAM, NADAM, and AMSGRAD.
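
The two ingredients named in the abstract, a gradient evaluated at a remote observation point and an elementwise nonincreasing preconditioner, can be illustrated with a small pure-Python sketch. This is not the authors' update rule (see the linked Namsg.py for that); it is an assumed look-ahead variant of AMSGRAD written only for illustration. The function names `observed_amsgrad_step` and `run_demo`, the parameter `eta` standing in for the observation factor, the default hyperparameters, and the toy quadratic objective are all hypothetical.

```python
import math


def observed_amsgrad_step(x, state, grad_fn, lr=0.05, beta1=0.9,
                          beta2=0.99, eta=0.5, eps=1e-8):
    """One step of an illustrative look-ahead AMSGrad variant.

    Assumption: `eta` plays the role of an observation factor; this is a
    sketch, not the update rule from the paper.
    """
    m, v, v_hat = state
    # Observation point: look ahead along the current preconditioned momentum.
    x_obs = [xi - eta * lr * mi / (math.sqrt(vh) + eps)
             for xi, mi, vh in zip(x, m, v_hat)]
    g = grad_fn(x_obs)  # gradient is taken at x_obs, not at x
    # Exponential moving averages of the gradient and its square.
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # AMSGrad-style running maximum: v_hat never decreases, so the
    # elementwise scaling 1 / sqrt(v_hat) never increases.
    v_hat = [max(vh, vi) for vh, vi in zip(v_hat, v)]
    x = [xi - lr * mi / (math.sqrt(vh) + eps)
         for xi, mi, vh in zip(x, m, v_hat)]
    return x, (m, v, v_hat)


def run_demo(steps=500):
    """Minimize a toy quadratic whose two directions differ in curvature."""
    curv = [100.0, 1.0]  # hypothetical curvatures

    def loss(p):
        return 0.5 * sum(c * pi * pi for c, pi in zip(curv, p))

    def grad(p):
        return [c * pi for c, pi in zip(curv, p)]

    x = [1.0, 1.0]
    state = ([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
    initial_loss = loss(x)
    v_hat_trace = []
    for _ in range(steps):
        x, state = observed_amsgrad_step(x, state, grad)
        v_hat_trace.append(list(state[2]))
    return initial_loss, loss(x), v_hat_trace
```

Calling `run_demo()` returns the initial loss, the final loss, and the history of `v_hat`, which makes it easy to check that the preconditioner scaling only shrinks over time. Setting `eta=0` in this sketch recovers a plain AMSGrad-style step without bias correction, which is one way to see what the observation point changes.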

Code: https://github.com/rationalspark/NAMSG/blob/master/Namsg.py

Keywords: neural networks, training, adaptive methods

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/namsg-an-efficient-method-for-training-neural/code)

