Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

Published: 24 Apr 2023, Last Modified: 15 Jun 2023, ICML 2023 Poster
Abstract: In deep learning, the training process often finds an interpolator (a solution with 0 training loss), yet the test loss is still low. This phenomenon, known as *benign overfitting*, is a major mystery that has received a lot of recent attention. One common mechanism for benign overfitting is *implicit regularization*, where the training process leads to additional properties of the interpolator, often characterized by minimizing certain norms. However, even for a simple sparse linear regression problem $y = \beta^{\ast\top} x +\xi$ with sparse $\beta^{\ast}$, neither the minimum $\ell_1$ nor the minimum $\ell_2$ norm interpolator achieves the optimal test loss. In this work, we give a different parametrization of the model, which leads to a new implicit regularization effect that combines the benefits of the $\ell_1$ and $\ell_2$ interpolators. We show that training our new model via gradient descent leads to an interpolator with near-optimal test loss. Our result is based on a careful analysis of the training dynamics and provides another example of an implicit regularization effect that goes beyond norm minimization.
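The abstract does not spell out the reparametrization, so the following is only a minimal sketch of the general idea: run gradient descent on a sparse linear regression problem under an overparametrized, elementwise parametrization of $\beta$ with small initialization, and observe that the resulting interpolator stays close to the sparse ground truth. The specific choice $\beta = u \odot u - v \odot v$, the initialization scale `alpha`, and all problem sizes below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Sketch (assumed, not the paper's exact model): gradient descent on the
# Hadamard-product parametrization beta = u*u - v*v, a commonly studied
# reparametrization whose small-initialization dynamics induce a sparsity bias.

rng = np.random.default_rng(0)
n, d, k = 100, 400, 5          # samples, dimension, sparsity (d > n: overparametrized)
noise_std = 0.1

# Ground-truth sparse signal beta* and data y = X beta* + xi
beta_star = np.zeros(d)
beta_star[rng.choice(d, k, replace=False)] = rng.choice([-1.0, 1.0], k)
X = rng.standard_normal((n, d))
y = X @ beta_star + noise_std * rng.standard_normal(n)

# Small initialization is the key ingredient for the implicit bias
alpha, lr, steps = 1e-3, 0.02, 20_000
u = alpha * np.ones(d)
v = alpha * np.ones(d)

for _ in range(steps):
    beta = u * u - v * v
    grad_beta = X.T @ (X @ beta - y) / n   # gradient of 0.5 * mean squared loss w.r.t. beta
    u -= lr * 2 * u * grad_beta            # chain rule through beta = u*u - v*v
    v -= lr * (-2) * v * grad_beta

beta_hat = u * u - v * v
train_mse = np.mean((X @ beta_hat - y) ** 2)
param_err = np.sum((beta_hat - beta_star) ** 2)   # proxy for excess test risk
print(f"train MSE: {train_mse:.2e}, ||beta_hat - beta*||^2: {param_err:.2e}")
```

In this sketch the training loss is driven to (near) zero, i.e., the model overfits, while the recovered coefficients remain close to the sparse $\beta^{\ast}$; how the paper's own parametrization improves on the plain $\ell_1$- and $\ell_2$-biased interpolators is the subject of the full text.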
Submission Number: 1964