Uniform convergence may be unable to explain generalization in deep learning

Vaishnavh Nagarajan, Zico Kolter

06 Sept 2019 (modified: 05 May 2023), NeurIPS 2019
Abstract: We cast doubt on the power of uniform convergence-based generalization bounds to provide a complete picture of why overparameterized deep networks generalize well. While it is well-known that many existing uniform convergence-based bounds are numerically large, through a variety of experiments, we first bring to light another crucial and more concerning aspect of these bounds: in practice, these bounds can {\em increase} with the dataset size. Guided by our observations, we then present examples of overparameterized linear classifiers and neural networks trained by gradient descent (GD) where uniform convergence provably cannot ``explain generalization,'' even if we take into account implicit regularization by GD {\em to the fullest extent possible}. More precisely, even if we consider only the set of classifiers output by GD that have test errors less than some small $\epsilon$, applying (two-sided) uniform convergence on this set of classifiers yields a generalization guarantee that is larger than $1-\epsilon$ and is therefore nearly vacuous.
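For reference, the (two-sided) uniform convergence notion invoked in the abstract can be sketched as follows; the notation below ($\mathcal{D}$ for the data distribution, $S \sim \mathcal{D}^m$ for a training set of size $m$, $\hat{\mathcal{L}}_S$ and $\mathcal{L}_{\mathcal{D}}$ for empirical and test error) is standard and assumed here rather than spelled out in the abstract. The tightest such bound for a hypothesis set $\mathcal{H}$ is the smallest $\epsilon_{\mathrm{unif}}(m,\delta)$ for which, with probability at least $1-\delta$ over draws of $S \sim \mathcal{D}^m$,
$$\sup_{h \in \mathcal{H}} \left| \mathcal{L}_{\mathcal{D}}(h) - \hat{\mathcal{L}}_{S}(h) \right| \;\le\; \epsilon_{\mathrm{unif}}(m,\delta).$$
The negative result described in the abstract restricts $\mathcal{H}$ to only those classifiers that GD actually outputs and that have test error at most $\epsilon$, and even then the resulting guarantee exceeds $1-\epsilon$, making it nearly vacuous.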
Code Link: https://github.com/locuslab/uniform-convergence-NeurIPS19