Improved Overparametrization Bounds for Global Convergence of SGD for Shallow Neural Networks

Published: 01 Jan 2023, Last Modified: 13 Jun 2023. Trans. Mach. Learn. Res., 2023.
Abstract: We study the overparametrization bounds required for the global convergence of the stochastic gradient descent (SGD) algorithm for a class of one-hidden-layer feed-forward neural networks with the ReLU activation function. We improve on the existing state-of-the-art results in terms of the required hidden-layer width. We introduce a new proof technique that combines nonlinear analysis with properties of random initializations of the network.
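To make the setting concrete, below is a minimal sketch of the kind of model and training procedure the abstract describes: a one-hidden-layer ReLU network of width m, trained by SGD from a random Gaussian initialization. The data, the 1/sqrt(m) output scaling, the choice to train only the hidden layer, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n samples, input dimension d, hidden-layer width m.
# Overparametrization corresponds to taking m large relative to n.
n, d, m = 100, 10, 1024
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)  # placeholder targets

# Random initialization: Gaussian hidden layer, fixed random signs on the
# output layer scaled by 1/sqrt(m) (a common convention; assumed here).
W = rng.standard_normal((m, d))
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)

def forward(W, x):
    """f(x) = sum_r a_r * ReLU(w_r . x)."""
    return a @ np.maximum(W @ x, 0.0)

lr = 1e-2
for step in range(1000):
    i = rng.integers(n)          # one random sample -> stochastic gradient
    x, target = X[i], y[i]
    pre = W @ x                  # pre-activations of the hidden layer
    residual = forward(W, x) - target
    # Gradient of the squared loss 0.5 * residual^2 w.r.t. W;
    # the ReLU derivative is the indicator (pre > 0).
    grad_W = residual * np.outer(a * (pre > 0.0), x)
    W -= lr * grad_W             # SGD step on the hidden layer only
```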