## Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks

29 Sept 2021, 00:35 (edited 14 Mar 2022) | ICLR 2022 Poster | Readers: Everyone
• Keywords: underparameterized regime, spectral bias, neural tangent kernel, implicit bias, implicit regularization, gradient flow
• Abstract: We study the dynamics of a neural network in function space when optimizing the mean squared error via gradient flow. We show that in the underparameterized regime the network learns eigenfunctions of an integral operator $T_K$ determined by the Neural Tangent Kernel (NTK) at rates corresponding to their eigenvalues. For example, for uniformly distributed data on the sphere $S^{d-1}$ and rotation-invariant weight distributions, the eigenfunctions of $T_K$ are the spherical harmonics. Our results can be understood as describing a spectral bias in the underparameterized regime. The proofs use the concept of "Damped Deviations", where deviations of the NTK matter less for eigendirections with large eigenvalues. Aside from the underparameterized regime, the damped deviations point of view allows us to extend certain results in the literature in the overparameterized setting.
• One-sentence Summary: Underparameterized networks optimizing MSE learn eigenfunctions of an NTK integral operator at rates corresponding to their eigenvalues.
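
The statement that eigenfunctions are learned "at rates corresponding to their eigenvalues" can be illustrated with a small numerical sketch. It is not the paper's setting or method: it assumes a fixed kernel (so the deviations of the NTK that the paper controls are ignored), uses discrete gradient descent in place of gradient flow, and substitutes a stand-in rotation-invariant kernel `exp(<x, y>)` for the NTK. The function name `residual_decay` and all parameter values are hypothetical.

```python
import numpy as np

def residual_decay(K, r0, step, n_steps):
    """Iterate r <- r - step * (K / n) r and return eigen-coefficients.

    K / n plays the role of an empirical version of the integral operator;
    r is the residual (prediction minus target) on the sample points.
    """
    n = K.shape[0]
    A = K / n
    eigvals, eigvecs = np.linalg.eigh(A)  # ascending eigenvalues, orthonormal columns
    r = r0.copy()
    for _ in range(n_steps):
        r = r - step * (A @ r)
    # Coefficients of the initial and final residual in the eigenbasis.
    return eigvals, eigvecs.T @ r0, eigvecs.T @ r

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # sample points on the sphere S^2

K = np.exp(X @ X.T)        # assumption: stand-in PSD kernel, NOT the NTK
K = (K + K.T) / 2          # enforce exact symmetry

step, n_steps = 0.1, 200
eigvals, c0, c = residual_decay(K, r0=rng.normal(size=50), step=step, n_steps=n_steps)

# Under a fixed kernel, each coefficient shrinks by (1 - step * eigenvalue)
# per iteration, so directions with large eigenvalues are fit first.
predicted = c0 * (1 - step * eigvals) ** n_steps
```

In this idealized linear picture, the component of the residual along the top eigenvector is essentially gone after 200 steps, while components along near-zero eigenvalues are almost unchanged. That ordering is the spectral bias the abstract describes.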