Abstract: Gradient descent-based algorithms are central to neural network optimization, and most of them rely only on local properties, such as the first- and second-order moments of the gradients, to determine the local update direction. As a result, such algorithms often converge slowly when gradients are small and easily become trapped in local optima. Since the goal of optimization is to minimize the loss function, the loss value reflects the overall progress of the optimization, yet this signal has not been fully exploited. In this paper, we propose a loss-aware gradient adjusting strategy (LGA) based on the loss status. LGA automatically adjusts the update magnitude of the parameters to accelerate convergence and escape local optima by introducing a loss-incentive correction term that monitors the loss and adapts the accumulated gradient history. The proposed strategy can be applied to various gradient descent-based optimization algorithms. We provide a theoretical analysis of the convergence rate and empirical evaluations on several datasets to demonstrate the effectiveness of our method.
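To make the general idea concrete, below is a minimal sketch of a loss-aware scaling factor applied on top of plain SGD. The names `loss_incentive` and `lga_sgd_step`, and the specific form of the scaling factor, are assumptions chosen for illustration only; they are not the correction term defined by LGA in the paper.

```python
# Hypothetical sketch of loss-aware gradient scaling on top of plain SGD.
# The factor below is an assumed form for illustration, not the paper's LGA term.
import numpy as np

def loss_incentive(current_loss, best_loss, eps=1e-8):
    """Toy correction factor: grows when the current loss sits above the best
    loss seen so far, so stalled progress yields larger update steps."""
    return 1.0 + current_loss / (best_loss + eps)

def lga_sgd_step(params, grads, current_loss, best_loss, lr=0.01):
    """One SGD step whose magnitude is rescaled by the loss-aware factor."""
    scale = loss_incentive(current_loss, best_loss)
    return [p - lr * scale * g for p, g in zip(params, grads)]

# Usage on a 1-D quadratic f(w) = (w - 3)^2 with gradient 2 (w - 3).
w = [np.array(0.0)]
best = float("inf")
for _ in range(100):
    loss = float((w[0] - 3.0) ** 2)
    best = min(best, loss)
    grad = [2.0 * (w[0] - 3.0)]
    w = lga_sgd_step(w, grad, loss, best, lr=0.05)
print(w[0])  # approaches 3.0
```

The same wrapping pattern could in principle be placed around any base optimizer (e.g., momentum SGD or Adam), which matches the abstract's claim that the strategy applies to various gradient descent-based algorithms.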
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Optimization (eg, convex and non-convex optimization)