Keywords: PAC-Bayes bound, MAC-Bayes bound, KL divergence, block-sample MAC-Bayes bound
TL;DR: We propose a novel PAC-Bayes-style generalization bound for learning algorithms that holds in expectation, and show that a comparably tight high-probability version is not possible in general.
Abstract: We present a family of novel block-sample MAC-Bayes bounds (mean approximately correct). While PAC-Bayes bounds (probably approximately correct) typically give bounds on the generalization error that hold with high probability, MAC-Bayes bounds have a similar form but bound the expected generalization error instead. The family of bounds we propose can be understood as a unification of an expectation version of known PAC-Bayes bounds and the individual-sample information-theoretic bounds. Compared to standard PAC-Bayes bounds, the new bounds contain divergence terms that depend only on subsets (or blocks) of the training data. We show that the tightness of these bounds depends on the choice of the block size as well as the comparator function, and that, depending on the learning scenario, different choices of these two parameters can be optimal. Furthermore, we explore the question of whether high-probability versions of our MAC-Bayes bounds (i.e., PAC-Bayes bounds of a similar form) are possible. We answer this question in the negative with an example showing that, in general, it is not possible to establish a PAC-Bayes bound which (a) vanishes at a rate faster than $\mathcal{O}(1/\log n)$ whenever the proposed MAC-Bayes bound vanishes at rate $\mathcal{O}(n^{-1/2})$, and (b) exhibits a logarithmic dependence on the permitted error probability.
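For context, the following is a minimal illustrative sketch of the two ingredient families the abstract refers to, not the paper's own statements: a standard McAllester-style PAC-Bayes bound for losses in $[0,1]$ and a standard individual-sample information-theoretic bound for $\sigma$-subgaussian losses. The notation ($S$, $W$, $P$, $Q$, $L$, $\hat{L}$) is introduced here purely for illustration and need not match the paper's.

```latex
% Illustrative sketch only; standard forms from the literature, not the paper's bounds.
% S = (Z_1,\dots,Z_n): i.i.d. training sample, W: learned hypothesis,
% L: population risk, \hat{L}: empirical risk, P: prior, Q: (data-dependent) posterior.

% Classical PAC-Bayes bound (McAllester-style, loss in [0,1]):
% holds with probability at least 1-\delta over the draw of S, for all Q simultaneously.
\[
  \mathbb{E}_{W \sim Q}\!\left[ L(W) \right]
  \;\le\;
  \mathbb{E}_{W \sim Q}\!\left[ \hat{L}(W, S) \right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \log\frac{n}{\delta}}{2(n-1)}} .
\]

% Individual-sample information-theoretic bound (sigma-subgaussian loss):
% holds in expectation over S and W; the information terms depend on single samples Z_i.
\[
  \mathbb{E}\!\left[ L(W) - \hat{L}(W, S) \right]
  \;\le\;
  \frac{1}{n} \sum_{i=1}^{n} \sqrt{2\sigma^{2}\, I(W; Z_i)} .
\]
```

As described in the abstract, the proposed block-sample MAC-Bayes bounds sit between these two regimes: the divergence terms are computed on blocks of the training data rather than on the full sample or on individual samples, with the block size and the comparator function as tuning parameters.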
Primary Area: learning theory
Submission Number: 14930