The role of over-parametrization in generalization of neural networks

Behnam Neyshabur; Zhiyuan Li; Srinadh Bhojanapalli; Yann LeCun; Nathan Srebro

Published: 21 Dec 2018, Last Modified: 05 May 2023. ICLR 2019 Conference Blind Submission. Readers: Everyone

Abstract: Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin, and sharpness, these complexity measures do not explain why neural networks generalize better with over-parametrization. In this work, we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error as network size increases (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.

Keywords: Generalization, Over-Parametrization, Neural Networks, Deep Learning

TL;DR: We suggest a generalization bound that could partly explain the improvement in generalization with over-parametrization.

Code: [bneyshabur/over-parametrization](https://github.com/bneyshabur/over-parametrization)
