On the Generalization of Neural Networks Trained with SGD: Information-Theoretical Bounds and Implications
Keywords: deep learning, generalization, information theory, learning bound, regularization
Abstract: Understanding the generalization behaviour of deep neural networks is a central theme of modern research in machine learning. In this paper, we follow up on recent work by Neu (2021) and present new information-theoretic upper bounds on the generalization error of neural networks trained with SGD. Our bounds and experimental study provide new insights into the SGD training of neural networks. They also point to a new and simple regularization scheme, which we show performs comparably to the current state of the art.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: We derive new information-theoretic generalization bounds for SGD and propose a new regularization scheme.
Supplementary Material: zip