The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

2019 (modified: 16 Apr 2023) | ICML 2019 | Readers: Everyone
Abstract: Understanding the behavior of stochastic gradient descent (SGD) in the context of deep neural networks has raised lots of concerns recently. Along this line, we study a general form of gradient bas...
