Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum

2022 (modified: 14 Nov 2022), ICML 2022, Readers: Everyone
Abstract: Adaptive Moment Estimation (Adam), which combines Adaptive Learning Rate and Momentum, is arguably the most popular stochastic optimizer for accelerating the training of deep neural networks. However,...