- Abstract: There exist several stochastic optimization algorithms. However in most cases, it is difficult to tell for a particular problem which will be the best optimizer to choose as each of them are good. Thus, we present a simple and intuitive technique, when applied to first order optimization algorithms, is able to improve the speed of convergence and reaches a better minimum for the loss function compared to the original algorithms. The proposed solution modifies the update rule, based on the variation of the direction of the gradient during training. We conducted several tests with Adam and AMSGrad on two different datasets. The preliminary results show that the proposed technique improves the performance of existing optimization algorithms and works well in practice.
- Keywords: Optimization, Optimizer, Adam, Gradient Descent