On the Generalization of Neural Networks Trained with SGD: Information-Theoretical Bounds and Implications

21 May 2021 (modified: 05 May 2023) · NeurIPS 2021 Submission
Keywords: deep learning, generalization, information theory, learning bound, regularization
Abstract: Understanding the generalization behaviour of deep neural networks is a central theme of modern machine learning research. In this paper, we follow up on recent work of Neu (2021) and present new information-theoretic upper bounds on the generalization error of neural networks trained with SGD. Our bounds and experimental study provide new insights into the SGD training of neural networks. They also point to a new and simple regularization scheme, which we show performs comparably to the current state of the art.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: We derive new information-theoretic generalization bounds for SGD and propose a new regularization scheme.
Supplementary Material: zip
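The abstract does not spell out the proposed regularization scheme. As a hedged illustration only (not the authors' actual method), the sketch below shows one regularizer that bounds of this kind naturally suggest: bounds in the style of Neu (2021) typically scale with the variance of minibatch gradients along the SGD trajectory, so one can penalize a proxy for that variance during training. All names here (`model`, `lam`, `grad_vector`, `train_step`) are hypothetical, and the two-minibatch gradient-disagreement proxy is our own assumption.

```python
import torch
import torch.nn as nn

# Hypothetical gradient-variance regularizer (illustrative sketch only;
# not the paper's scheme). For two i.i.d. minibatch gradients g1, g2,
# E[||g1 - g2||^2] = 2 * tr(Cov(g)), so their squared disagreement is an
# unbiased proxy (up to a factor of 2) for the minibatch gradient variance.

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
lam = 0.01  # regularization strength (hypothetical hyperparameter)

def grad_vector(loss):
    """Flatten the gradient of `loss` w.r.t. all model parameters,
    keeping the graph so the penalty itself can be differentiated."""
    grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def train_step(x1, y1, x2, y2):
    # Gradient disagreement between two independent minibatches.
    g1 = grad_vector(loss_fn(model(x1), y1))
    g2 = grad_vector(loss_fn(model(x2), y2))
    penalty = (g1 - g2).pow(2).sum()

    # Standard SGD step on the penalized objective; backward() through
    # `penalty` triggers a double backward (Hessian-vector products).
    loss = loss_fn(model(x1), y1) + lam * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with synthetic data:
x1, y1 = torch.randn(32, 20), torch.randint(0, 2, (32,))
x2, y2 = torch.randn(32, 20), torch.randint(0, 2, (32,))
print(train_step(x1, y1, x2, y2))
```

The two-minibatch trick avoids computing per-sample gradients while still estimating the gradient noise variance without bias; the cost is one extra forward/backward pass per step plus the double-backward needed to differentiate the penalty.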