The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study

2019 (modified: 11 Nov 2022) | ICML 2019 | Readers: Everyone
Abstract: We investigate how the final parameters found by stochastic gradient descent are influenced by over-parameterization. We generate families of models by increasing the number of channels in a base n...