How you start matters for generalization

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: spectral bias, generalization
TL;DR: We promote a shift of focus towards initialization, rather than neural architecture or (stochastic) gradient descent, to explain the implicit regularization behind generalization in over-parameterized networks
Abstract: Characterizing the remarkable generalization properties of over-parameterized neural networks remains an open problem. A growing body of recent literature shows that the bias of stochastic gradient descent (SGD) and architecture choice implicitly lead to better generalization. In this paper, we show, on the contrary, that SGD can itself cause poor generalization, independently of architecture, if one does not ensure a good initialization. Specifically, we prove that any differentiably parameterized model, trained under gradient flow, obeys a weak spectral bias law, which states that sufficiently high frequencies train arbitrarily slowly. This implies that very high frequencies present at initialization will remain after training and hamper generalization. Further, we empirically test these theoretical insights on practical, deep networks. Finally, we contrast our framework with that supplied by the \emph{flat-minima} conjecture and show that Fourier analysis provides a more reliable account of the generalization of neural networks.
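
For readers who want a concrete feel for the spectral bias described in the abstract, the following is a minimal illustrative sketch, not the paper's construction or proof: it fits a small tanh network to a one-dimensional target containing one low-frequency and one high-frequency component, and prints how the Fourier spectrum of the training residual evolves. All widths, weight scales, frequencies, step counts, and the learning rate are arbitrary choices made for illustration; with settings like these one typically observes the low-frequency residual shrinking much faster than the high-frequency one.

# Minimal sketch (assumed setup, not the authors' experiment): spectral
# bias of a small tanh network trained by full-batch gradient descent,
# a crude discrete stand-in for the gradient flow analysed in the paper.
import numpy as np

rng = np.random.default_rng(0)

# Target on [0, 1): low frequency k=1 plus high frequency k=20.
n = 256
x = np.linspace(0.0, 1.0, n, endpoint=False)
y = np.sin(2 * np.pi * 1 * x) + 0.5 * np.sin(2 * np.pi * 20 * x)

# Two-layer tanh network with hand-written backpropagation.
width = 256
W1 = rng.normal(0.0, 4.0, (width, 1))              # input weights
b1 = rng.uniform(-4.0, 4.0, width)                  # input biases
w2 = rng.normal(0.0, 1.0 / np.sqrt(width), width)   # output weights
b2 = 0.0
lr = 0.1
X = x[:, None]

for step in range(30001):
    h = np.tanh(X @ W1.T + b1)        # hidden activations, shape (n, width)
    pred = h @ w2 + b2
    err = pred - y                     # training residual
    g_pred = 2.0 * err / n             # gradient of mean-squared error w.r.t. pred
    # Backpropagate through the two layers.
    g_w2 = h.T @ g_pred
    g_b2 = g_pred.sum()
    g_h = np.outer(g_pred, w2) * (1.0 - h ** 2)
    g_W1 = g_h.T @ X
    g_b1 = g_h.sum(axis=0)
    W1 -= lr * g_W1; b1 -= lr * g_b1
    w2 -= lr * g_w2; b2 -= lr * g_b2
    if step % 5000 == 0:
        # Magnitude of the residual at the two target frequencies.
        spec = np.abs(np.fft.rfft(err)) / n
        print(f"step {step:5d}  residual@k=1: {spec[1]:.3f}  residual@k=20: {spec[20]:.3f}")

The qualitative behaviour, rather than the exact numbers, is the point: the k=1 component of the error decays early, while the k=20 component lingers, which is one simple way to visualize the claim that sufficiently high frequencies train arbitrarily slowly and that high-frequency content present at initialization tends to persist.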
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip