Time-independent Generalization Bounds for SGLD in Non-convex Settings

21 May 2021, 20:48 (modified: 20 Dec 2021, 19:30) · NeurIPS 2021 Poster
Keywords: SGLD, Langevin, stochastic gradient, generalization, stability, non-convex, Wasserstein, optimization
TL;DR: In the setting of non-convex learning, we derive generalization error bounds for SGLD that are time-independent and decay to zero as the sample size increases.
Abstract: We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
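The algorithm analyzed in the abstract, SGLD with a constant learning rate, follows the standard update theta_{k+1} = theta_k - eta * g(theta_k) + sqrt(2*eta/beta) * xi_k, where g is a stochastic gradient and xi_k is Gaussian noise. A minimal NumPy sketch of this update (function names, parameter values, and the toy objective are illustrative, not taken from the paper):

```python
import numpy as np

def sgld_step(theta, grad_fn, eta=1e-3, beta=1.0, rng=None):
    """One SGLD update with constant learning rate eta:
    theta <- theta - eta * g(theta) + sqrt(2*eta/beta) * N(0, I).

    grad_fn should return a (possibly stochastic) gradient estimate.
    beta is the inverse temperature scaling the injected noise.
    """
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(theta.shape)
    return theta - eta * grad_fn(theta) + np.sqrt(2.0 * eta / beta) * noise

# Toy illustration: run SGLD on f(x) = ||x||^2, whose gradient is 2x.
# The iterates hover near the minimizer, perturbed by Gaussian noise.
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(1000):
    theta = sgld_step(theta, lambda t: 2.0 * t, eta=0.01, beta=10.0, rng=rng)
```

The constant (rather than decaying) learning rate is what makes the time-independence of the paper's bounds nontrivial: the iterates never stop moving, yet the generalization gap does not grow with the number of steps.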
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.