Published: 01 Jan 2020, Last Modified: 09 May 2023, ICML 2020, Readers: Everyone
Abstract: The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning. While past studies confirm that the magnitude and the covariance str...