Keywords: Deep Learning, Tempered Overfitting, Generalization
TL;DR: We prove that fully connected neural networks with quantized weights exhibit tempered overfitting under two interpolating learning rules: the smallest interpolating NN and a random interpolating NN.
Abstract: We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.
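For readers unfamiliar with the term, a minimal sketch of what "tempered overfitting" typically means in this setting (this formalization is illustrative and not stated in the abstract; the symbols p, n, L, and \hat f_n are assumptions): with label-noise rate p, the clean test error of the interpolating predictor \hat f_n trained on n samples converges to a value strictly between the noise level and trivial (chance-level) error, rather than vanishing (benign) or approaching 1/2 (catastrophic).

```latex
% Illustrative formalization of tempered overfitting for binary classification
% with label-noise rate p (hypothetical notation, not taken from the abstract):
% the asymptotic clean test error of the interpolating rule \hat f_n stays within
% a constant factor of the noise level, bounded away from chance.
\[
  p \;<\; \lim_{n \to \infty} L(\hat f_n) \;\le\; c\, p
  \qquad \text{for some constant } c \ge 1 \text{ with } c\, p < \tfrac{1}{2}.
\]
```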
Primary Area: Learning theory
Submission Number: 20613