Practical guidelines for resolving the loss divergence caused by the root-mean-squared propagation optimizer

Yuan-Long Peng, Wei-Po Lee

Published: 2024, Last Modified: 05 Jun 2025. Appl. Soft Comput. 2024. CC BY-SA 4.0

Abstract: Highlights
• We provide methods to detect loss divergence while training neural networks.
• We explain the causes of loss divergence, and countermeasures against it, in plain English.
• Experiments were conducted on both computer vision and natural language processing datasets.
• We present a brief survey of second-order momentum optimizers.