Information-Theoretic Generalization Bounds for Deep Neural Networks

Published: 27 Oct 2023, Last Modified: 14 Dec 2023, InfoCog@NeurIPS2023 Oral
Keywords: information theory, generalization error, deep neural network, internal representation
Abstract: Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications. This work aims to capture the effect and benefits of depth for learning within the paradigm of information-theoretic generalization bounds. We derive two novel hierarchical bounds on the generalization error that capture the effect of the internal representations within each layer. The first bound shows that the generalization error bound shrinks as the layer index of the internal representation increases. The second quantifies the contraction of the relevant information measures as one moves deeper into the network. To achieve this, we leverage the strong data processing inequality (SDPI) and employ a stochastic approximation of the DNN model whose SDPI coefficient we can explicitly control. These results provide a new perspective for understanding generalization in deep models.
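As background for readers new to this line of work, the following is a minimal sketch of the quantities the abstract refers to. The starting point, the input-output bound of Xu & Raginsky (2017), is standard; the layer-wise notation (T_k for the layer-k representation, eta_k for the per-layer SDPI coefficient) is illustrative and not necessarily the paper's.

```latex
% Baseline information-theoretic bound (Xu & Raginsky, 2017):
% for a \sigma-sub-Gaussian loss, n training samples S, and
% learned weights W,
\[
  \bigl|\mathbb{E}[\operatorname{gen}(S, W)]\bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}\, I(S; W)}{n}} .
\]
% Strong data processing inequality (SDPI): if the layer-k
% representation T_k is produced from T_{k-1} by a noisy channel K_k,
% then for the Markov chain S \to T_{k-1} \to T_k (conditioning on
% the learned weights is suppressed in this sketch),
\[
  I(S; T_k) \;\le\; \eta_k \, I(S; T_{k-1}),
  \qquad \eta_k := \eta_{\mathrm{KL}}(K_k) < 1 ,
\]
% so the relevant information contracts multiplicatively with depth:
\[
  I(S; T_L) \;\le\; \Bigl(\prod_{k=1}^{L} \eta_k\Bigr)\, I(S; T_0) .
\]
```

A stochastic approximation of a DNN with a controllable SDPI coefficient can be realized, in the spirit of noisy-network analyses, as a deterministic layer followed by additive Gaussian noise. The sketch below is a hypothetical illustration of that construction, not the paper's implementation; the name noisy_layer and the noise parameter beta are my own.

```python
import numpy as np

def noisy_layer(x, weights, bias, beta=0.1, rng=None):
    """One layer of a 'noisy' DNN: a deterministic map followed by
    additive Gaussian noise. Because tanh keeps the deterministic
    output bounded, the additive noise makes the layer a genuinely
    stochastic channel whose SDPI contraction coefficient is strictly
    below 1 and shrinks as beta grows."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.tanh(x @ weights + bias)                  # deterministic part
    return h + beta * rng.standard_normal(h.shape)   # Gaussian channel

# Example: propagate a batch through one noisy layer.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 16))                # batch of 32 inputs
W = rng.standard_normal((16, 8)) / np.sqrt(16)   # layer weights
b = np.zeros(8)
t1 = noisy_layer(x, W, b, beta=0.1, rng=rng)     # layer-1 representation
```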
Submission Number: 12