On the Noisy Gradient Descent that Generalizes as SGD

Published: 01 Jan 2020, Last Modified: 09 May 2023, ICML 2020, Readers: Everyone
Abstract: The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning. While past studies confirm that the magnitude and the covariance str...