VC Theoretical Explanation of Double Descent

16 May 2022 (modified: 05 May 2023) · NeurIPS 2022 Submission
Abstract: There has been growing interest in the generalization performance of large multilayer neural networks that can be trained to achieve zero training error yet still generalize well on test data. This regime, known as the 'second descent', appears to contradict the conventional view that optimal model complexity should reflect an optimal balance between underfitting and overfitting, i.e., the bias-variance trade-off. This paper presents a VC-theoretical analysis of double descent and shows that it can be fully explained by classical VC generalization bounds. We illustrate the application of analytic VC bounds to modeling double descent for classification problems, using empirical results for several learning methods, such as SVM, Least Squares, and Multilayer Perceptron classifiers. In addition, we discuss several possible reasons for the misunderstanding of VC-theoretical results in the machine learning community.
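The abstract does not state which analytic bound is used; a natural candidate is the classical Vapnik VC bound for classification, sketched below for reference (the paper's exact form and constants may differ). With probability at least $1-\eta$, for a classifier $f$ from a hypothesis class of VC dimension $h$ trained on $n$ samples:

$$
R(f) \;\le\; R_{\mathrm{emp}}(f) \;+\; \sqrt{\frac{h\left(\ln\frac{2n}{h}+1\right) - \ln\frac{\eta}{4}}{n}},
$$

where $R(f)$ is the expected (test) error and $R_{\mathrm{emp}}(f)$ is the training error. Under this bound, generalization is governed by the ratio $h/n$ rather than by parameter count alone, which is the kind of argument the abstract suggests for explaining the second descent.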
Supplementary Material: pdf
