A Practical PAC-Bayes Generalisation Bound for Deep Learning

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: generalisation, hessian, pac-bayesian
Abstract: Under a PAC-Bayesian framework, we derive an implementation-efficient, parameterisation-invariant metric that measures the gap between the true and empirical risk. We show that, at solutions of low training loss, this metric can be approximated at the same cost as a single step of SGD. We investigate the behaviour of this metric on pathological examples in which traditional Hessian-based sharpness metrics increase while generalisation also improves, and find good experimental agreement. As a consequence of our PAC-Bayesian framework and of theoretical arguments on the effect of sub-sampling the Hessian, we include a Hessian-trace term in our structural risk. We find that this term promotes generalisation across a variety of experiments using Wide Residual Networks on the CIFAR datasets.
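The Hessian-trace term in the structural risk above can in general be estimated from Hessian-vector products alone, e.g. via Hutchinson's stochastic trace estimator. The sketch below illustrates that standard technique on a toy quadratic loss; the function names (`hutchinson_trace`, `hvp`) and the use of a fixed matrix as the Hessian are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=100, rng=None):
    """Estimate tr(H) given only a Hessian-vector-product oracle `hvp`.

    Uses Hutchinson's identity tr(H) = E[v^T H v] for Rademacher
    probe vectors v (entries +/-1 with equal probability).
    """
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        total += v @ hvp(v)                    # accumulate v^T H v
    return total / n_samples

# Toy check: the quadratic loss L(w) = 0.5 * w^T A w has Hessian A,
# so the estimator should recover tr(A) = 1 + 2 + 3 = 6.
A = np.diag([1.0, 2.0, 3.0])
est = hutchinson_trace(lambda v: A @ v, dim=3, n_samples=500, rng=0)
```

In a deep-learning setting the oracle `hvp` would be implemented with automatic differentiation (a gradient-of-gradient product), which is what makes a trace estimate affordable at roughly the cost of an extra backward pass per probe.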
One-sentence Summary: A practical generalisation measure for deep neural networks, and a Hessian-based regularisation term that promotes generalisation.
Supplementary Material: zip