On the properties of variational approximations of Gibbs posteriors
Abstract: The PAC-Bayesian approach is a powerful set of techniques for deriving non-asymptotic risk bounds for random estimators. The corresponding optimal distribution of estimators, usually called the Gibbs posterior, is unfortunately often intractable. One may sample from it using Markov chain Monte Carlo, but this is usually too slow for big datasets. We consider instead variational approximations of the Gibbs posterior, which are fast to compute. We undertake a general study of the properties of such approximations. Our main finding is that such a variational approximation often has the same rate of convergence as the original PAC-Bayesian procedure it approximates. In addition, we show that, when the risk function is convex, a variational approximation can be obtained in polynomial time using a convex solver. We give finite-sample oracle inequalities for the corresponding estimator. We specialize our results to several learning tasks (classification, ranking, matrix completion), discuss how to implement a variational approximation in each case, and illustrate the good properties of said approximation on real datasets.
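To make the setting concrete, here is a minimal sketch (not the authors' implementation) of a variational approximation to a Gibbs posterior on a toy 1-D classification problem. The Gibbs posterior is proportional to prior(θ)·exp(−λ·n·R_n(θ)) for an empirical risk R_n; we approximate it within a Gaussian family N(m, s²) by minimizing λ·n·E_q[R_n(θ)] + KL(q‖prior), estimating the expectation with fixed reparameterization draws. The data, the logistic risk, the standard-normal prior, and the choice λ = n are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy 1-D classification data: labels mostly follow the sign of x.
n = 200
x = rng.normal(size=n)
y = np.sign(x + 0.3 * rng.normal(size=n))

def emp_risk(theta):
    # Empirical logistic risk R_n(theta); convex in theta.
    return np.mean(np.log1p(np.exp(-y * theta * x)))

lam = n                       # inverse temperature (illustrative choice)
eps = rng.normal(size=64)     # fixed draws for the reparameterization trick

def objective(p):
    # Variational bound for q = N(m, s^2) against the prior N(0, 1):
    # lam * E_q[R_n(theta)] + KL(q || prior).
    m, log_s = p
    s = np.exp(log_s)
    thetas = m + s * eps
    exp_risk = np.mean([emp_risk(t) for t in thetas])
    kl = 0.5 * (s**2 + m**2 - 1.0) - log_s
    return lam * exp_risk + kl

res = minimize(objective, x0=[0.0, 0.0], method="Nelder-Mead")
m_hat, s_hat = res.x[0], np.exp(res.x[1])
print(f"variational Gibbs posterior: N(m={m_hat:.3f}, s={s_hat:.3f})")
```

Because the objective is a smooth function of (m, log s) and the risk is convex, a generic solver suffices here; the paper's point is that in such convex cases the variational problem itself is tractable, unlike sampling from the exact Gibbs posterior.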