Compressed Sensing and Overparametrized Networks: Overfitting Peaks in a Model of Misparametrized Sparse Regression in the Interpolation Limit

Published: 21 Oct 2019, Last Modified: 05 May 2023. NeurIPS 2019 Deep Inverse Workshop Poster.
TL;DR: Proposes an analytically tractable model and inference procedure (misparametrized sparse regression, inferred using an $l_1$ penalty and studied in the data-interpolation limit) to study deep-net-related phenomena in the context of inverse problems.
Keywords: compressed sensing, overfitting, perfect fitting, double descent, thermodynamic limit, interpolation
Abstract: Current practice in machine learning is to employ deep nets in an overparametrized limit, with the nominal number of parameters typically exceeding the number of measurements. This resembles the situation in compressed sensing, or in sparse regression with $l_1$ penalty terms, and provides a theoretical avenue for understanding phenomena that arise in the context of deep nets. One such phenomenon is the success of deep nets in providing good generalization in an interpolating regime with zero training error. Traditional statistical practice calls for regularization or smoothing to prevent "overfitting" (poor generalization performance). However, recent work shows that there exist data-interpolation procedures which are statistically consistent and provide good generalization performance \cite{belkin2018overfitting} ("perfect fitting"). In this context, it has been suggested that "classical" and "modern" regimes for machine learning are separated by a peak in the generalization error ("risk") curve, a phenomenon dubbed "double descent" \cite{belkin2019reconciling}. While such overfitting peaks do exist and arise from ill-conditioned design matrices, here we challenge the interpretation of the overfitting peak as demarcating the regime where good generalization occurs under overparametrization. We propose a model of Misparametrized Sparse Regression (MiSpaR) and analytically compute the generalization-error (GE) curves for $l_2$ and $l_1$ penalties. We show that the overfitting peak arising in the interpolation limit is dissociated from the regime of good generalization. The analytical expressions are obtained in the so-called "thermodynamic" limit. We find an additional interesting phenomenon: increasing overparametrization in the fitting model increases sparsity, which should intuitively improve the performance of $l_1$-penalized regression. At the same time, however, the number of measurements decreases relative to the number of fitting parameters, and eventually overparametrization does lead to poor generalization. Nevertheless, $l_1$-penalized regression can show good generalization performance under conditions of data interpolation even with a large amount of overparametrization. These results provide a theoretical avenue for studying inverse problems in the interpolating regime using overparametrized fitting functions such as deep nets.
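
To make the setup concrete, the following is a minimal numerical sketch of the MiSpaR setting, not the authors' code: it assumes an i.i.d. Gaussian design, data generated from the first p0 columns only (so the remaining p - p0 fitting coefficients are spurious), a small amount of measurement noise, and a near-zero penalty to approximate the data-interpolation limit. scikit-learn's Lasso and Ridge stand in for the $l_1$- and $l_2$-penalized estimators, and all parameter values (n, p0, sparsity, noise level) are illustrative.

    # Minimal numerical sketch of Misparametrized Sparse Regression (MiSpaR).
    # Assumptions (not from the paper): i.i.d. Gaussian design, data generated
    # from the first p0 columns only, light noise, and a near-zero penalty to
    # approximate the data-interpolation limit. All parameter values are arbitrary.
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.default_rng(0)

    n, p0, sigma = 100, 40, 0.5            # measurements, true parameters, noise level
    beta_true = np.zeros(p0)
    support = rng.choice(p0, size=10, replace=False)
    beta_true[support] = rng.normal(size=10)   # sparse ground-truth coefficients

    def generalization_error(p, penalty="l1", lam=1e-4):
        """Fit a model with p >= p0 columns and return test MSE (the GE)."""
        X_train = rng.normal(size=(n, p))
        X_test = rng.normal(size=(2000, p))
        # Data depend only on the first p0 columns; the remaining p - p0
        # coefficients are spurious fitting parameters (misparametrization).
        y_train = X_train[:, :p0] @ beta_true + sigma * rng.normal(size=n)
        y_test = X_test[:, :p0] @ beta_true
        model = (Lasso(alpha=lam, max_iter=100000) if penalty == "l1"
                 else Ridge(alpha=lam))
        model.fit(X_train, y_train)
        return np.mean((model.predict(X_test) - y_test) ** 2)

    # Sweep the overparametrization ratio p/n.
    for p in [60, 90, 100, 110, 150, 300, 600]:
        print(f"p/n = {p/n:4.1f}   l2 GE = {generalization_error(p, 'l2'):7.3f}   "
              f"l1 GE = {generalization_error(p, 'l1'):7.3f}")

Sweeping p/n in such a simulation is one way to visualize the dissociation described above: the near-interpolating $l_2$ fit peaks around p = n (the ill-conditioned regime), while the $l_1$ fit can continue to generalize well past that point before eventually degrading as measurements become scarce relative to fitting parameters.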