Generalization Bounds via Meta-Learned Model Representations: PAC-Bayes and Sample Compression Hypernetworks

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We investigate hypernetwork architectures for deriving generalization bounds in meta-learning by leveraging the PAC-Bayesian framework and Sample Compression theory.
Abstract: Both the PAC-Bayesian and Sample Compress learning frameworks have been shown to be instrumental for deriving tight (non-vacuous) generalization bounds for neural networks. We leverage these results in a meta-learning scheme, relying on a hypernetwork that takes a dataset as input and outputs the parameters of a downstream predictor. The originality of our approach lies in the investigated hypernetwork architectures that encode the dataset before decoding the parameters: (1) a PAC-Bayesian encoder that expresses a posterior distribution over a latent space, (2) a Sample Compress encoder that selects a small sample of the input dataset along with a message from a discrete set, and (3) a hybrid of both approaches motivated by a new Sample Compress theorem handling continuous messages. The latter theorem exploits the pivotal information transiting at the encoder-decoder junction in order to compute generalization guarantees for each downstream predictor obtained by our meta-learning scheme.
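The sketch below illustrates, under simplifying assumptions, the encoder-decoder structure described in the abstract: a dataset is encoded into a stochastic latent code (the PAC-Bayesian encoder variant), and a decoder maps a sample of that code to the parameters of a downstream predictor. This is not the authors' DeepRM implementation; all module names, layer sizes, and the Gaussian latent are illustrative assumptions in a PyTorch-style API.

```python
# Minimal sketch (not the authors' DeepRM code) of an encoder-decoder hypernetwork.
# The Gaussian latent, layer sizes, and pooling scheme are assumptions for illustration.
import torch
import torch.nn as nn


class PACBayesHypernetwork(nn.Module):
    """Encodes a labelled dataset into a stochastic latent code (PAC-Bayesian encoder),
    then decodes a latent sample into the parameters of a downstream linear predictor."""

    def __init__(self, input_dim: int, latent_dim: int = 16):
        super().__init__()
        # Per-example encoder; the dataset embedding is obtained by mean pooling.
        self.encoder = nn.Sequential(nn.Linear(input_dim + 1, 64), nn.ReLU(),
                                     nn.Linear(64, 2 * latent_dim))
        # Decoder maps a latent sample to the predictor's weights and bias.
        self.decoder = nn.Linear(latent_dim, input_dim + 1)

    def forward(self, X: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Encode the dataset into a posterior (mean, log-std) over the latent space.
        pooled = self.encoder(torch.cat([X, y.unsqueeze(-1)], dim=-1)).mean(dim=0)
        mu, log_sigma = pooled.chunk(2)
        # Reparameterized sample from the posterior: the information transiting
        # at the encoder-decoder junction, on which the bounds are computed.
        z = mu + log_sigma.exp() * torch.randn_like(mu)
        return self.decoder(z)  # weights and bias of the downstream predictor


# Usage: produce a predictor for a toy 5-feature binary task.
hyper = PACBayesHypernetwork(input_dim=5)
X, y = torch.randn(32, 5), torch.randint(0, 2, (32,)).float()
params = hyper(X, y)
w, b = params[:-1], params[-1]
preds = torch.sign(X @ w + b)  # downstream linear predictor's outputs
```

A Sample Compress encoder would instead replace the Gaussian latent with a selection of a few dataset examples plus a discrete message, and the hybrid variant would combine that selection with a continuous message, as described in the abstract.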
Lay Summary: Since the beginning of machine learning research, statistical learning theorists have proposed ways of guaranteeing that artificial intelligence systems will "behave well" on tasks they haven't seen yet. These kinds of certifications are challenging to obtain for modern deep learning architectures, which we can see as a sophisticated arrangement of many building blocks. When one "asks" a neural network to perform a given task, its architecture defines how the information flows in the building blocks before providing an "answer." We propose a new way to design deep learning architectures to embody at their core ideas stemming from existing learning theories. This is achieved in a "meta-learning" context, where knowledge is gathered on multiple tasks and then leveraged to tackle new problems. In summary, we introduce an original strategy to bridge theoretical guarantees and meta-learning: When asked to solve a new problem, our neural network provides certifications along with the answered solution.
Link To Code: https://github.com/GRAAL-Research/DeepRM
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: Meta-learning, PAC-Bayes, Sample Compression, Hypernetworks
Submission Number: 1383