Keywords: reinforcement learning, MCMC, policy gradient, parallel tempering, ICML
TL;DR: We introduce a flexible reinforcement learning approach for optimizing parallel tempering MCMC that outperforms state-of-the-art methods on some distributions.
Abstract: Parallel tempering is a meta-algorithm for Markov chain Monte Carlo (MCMC) methods which uses multiple chains to sample from tempered versions of the target distribution, improving mixing on multi-modal distributions that are difficult for traditional methods to explore. The success of this technique depends critically on the choice of chain temperatures. We introduce an adaptive temperature selection algorithm which adjusts temperatures during sampling using a policy gradient method. Experimental results show that it can outperform traditional geometrically-spaced temperatures and uniform-acceptance-rate temperature ladders in terms of integrated autocorrelation time on test distributions.
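To make the setup concrete, below is a minimal sketch (not the authors' implementation) of parallel tempering with a REINFORCE-style policy gradient update of the temperature ladder. The Gaussian policy over log-spacings, the swap-acceptance reward, and all names such as `log_spacings` and `policy_std` are illustrative assumptions; the paper's actual policy parameterization and reward are not specified here.

```python
# Hypothetical sketch: parallel tempering with a policy-gradient temperature update.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Bimodal 1D target: mixture of two well-separated unit Gaussians.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

n_chains, n_iters, lr = 4, 5000, 1e-3
log_spacings = np.zeros(n_chains - 1)   # policy mean: log of ladder spacings (assumption)
policy_std = 0.1                        # fixed exploration noise of the Gaussian policy
x = rng.normal(size=n_chains)           # one sample per chain

for t in range(n_iters):
    # Sample a temperature ladder from the Gaussian policy over log-spacings.
    eps = rng.normal(size=n_chains - 1) * policy_std
    betas = np.concatenate([[1.0], np.exp(-np.cumsum(np.exp(log_spacings + eps)))])

    # Within-chain random-walk Metropolis at each inverse temperature beta.
    prop = x + rng.normal(scale=1.0, size=n_chains)
    accept = np.log(rng.uniform(size=n_chains)) < betas * (log_target(prop) - log_target(x))
    x = np.where(accept, prop, x)

    # Attempt swaps between adjacent chains; use swap acceptance probability as reward.
    rewards = np.zeros(n_chains - 1)
    for i in range(n_chains - 1):
        log_a = (betas[i] - betas[i + 1]) * (log_target(x[i + 1]) - log_target(x[i]))
        rewards[i] = np.exp(min(0.0, log_a))
        if np.log(rng.uniform()) < log_a:
            x[i], x[i + 1] = x[i + 1], x[i]

    # REINFORCE: score function of the Gaussian policy times (reward - baseline).
    advantage = rewards - rewards.mean()
    log_spacings += lr * advantage * eps / policy_std ** 2

print("learned inverse temperatures:",
      np.concatenate([[1.0], np.exp(-np.cumsum(np.exp(log_spacings)))]))
```

In this toy version the reward is the local swap acceptance between adjacent chains; an actual implementation might instead target integrated autocorrelation time or round-trip rate, as the abstract's evaluation metric suggests.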
Submission Number: 134