Stochastic Gradient Discrete Langevin Dynamics

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Stochastic Gradient, Langevin Dynamics, Discrete Langevin Dynamics, MCMC, Discrete Sampling
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose stochastic gradient Langevin dynamics for sampling in discrete spaces.
Abstract: Sampling via Markov chain Monte Carlo can be inefficient when each evaluation of the gradient of the energy function depends on a large dataset. In continuous spaces, this challenge has been addressed by extending Langevin samplers with stochastic gradient estimators. However, such an approach cannot be directly applied to discrete spaces, as a naive migration leads to biased estimation with large variance. To fill this gap, we propose a new sampling strategy, \emph{Stochastic Gradient Discrete Langevin Dynamics}, which provides the first practical method for stochastic-gradient sampling in discrete spaces. Our approach mitigates the bias of naive ``gradient'' estimators via a novel caching scheme, and reduces the estimation variance by introducing a modified Polyak step size control for simulation time adaptation. We demonstrate significant efficiency improvements across various sampling problems in discrete spaces, including Bayesian learning, stochastic integer programming, and prompt tuning for text-image models.
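For context, the continuous-space baseline that the abstract refers to is stochastic gradient Langevin dynamics (SGLD), where the full-data gradient of the log-posterior is replaced by a rescaled minibatch estimate. The following is a minimal sketch of standard SGLD on a toy 1-D Gaussian mean-estimation problem; it illustrates only the continuous-space method being extended, not the paper's discrete algorithm, and all names, step sizes, and prior choices here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: infer the mean of a 1-D Gaussian with known unit variance.
data = rng.normal(loc=2.0, scale=1.0, size=1000)
N = len(data)

def grad_log_post(theta, batch):
    """Minibatch estimate of the gradient of the log-posterior.

    Prior: theta ~ N(0, 10); likelihood: x_i ~ N(theta, 1).
    The likelihood gradient is rescaled by N / |batch| so the
    estimator is unbiased for the full-data gradient.
    """
    grad_prior = -theta / 10.0
    grad_lik = (N / len(batch)) * np.sum(batch - theta)
    return grad_prior + grad_lik

theta = 0.0
eps = 1e-4  # fixed step size (illustrative; real SGLD often anneals eps)
samples = []
for _ in range(5000):
    batch = rng.choice(data, size=32, replace=False)
    g = grad_log_post(theta, batch)
    # SGLD update: half-step along the stochastic gradient plus
    # Gaussian noise with variance eps.
    theta += 0.5 * eps * g + np.sqrt(eps) * rng.normal()
    samples.append(theta)

posterior_mean = float(np.mean(samples[1000:]))  # discard burn-in
print(posterior_mean)
```

The posterior mean recovered this way should land close to the sample mean of the data (about 2.0), since the prior is weak. The bias-and-variance issues the abstract describes arise precisely because this kind of minibatch substitution does not transfer naively to discrete state spaces.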
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4373