Keywords: reinforcement learning, MCMC, policy gradient, parallel tempering, ICML
TL;DR: We introduce a flexible reinforcement learning approach for optimizing parallel tempering MCMC that outperforms state-of-the-art methods on some distributions.
Abstract: Parallel tempering is a meta-algorithm for Markov chain Monte Carlo (MCMC) methods which uses multiple chains to sample from tempered versions of the target distribution, improving mixing on multi-modal distributions that are difficult for traditional methods to explore. The success of this technique depends critically on the choice of chain temperatures. We introduce an adaptive temperature selection algorithm which adjusts temperatures during sampling using a policy gradient method. Experimental results show that it can outperform traditional geometrically-spaced temperatures and uniform-acceptance-rate temperature ladders in terms of integrated autocorrelation time on test distributions.
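To make the setup concrete, below is a minimal sketch (not the authors' implementation) of parallel tempering with a REINFORCE-style policy gradient update of the temperature ladder. The Gaussian policy over log-spacings, the swap-acceptance reward, and all names such as `log_spacings` and `policy_std` are illustrative assumptions; the paper's actual policy parameterization and reward are not specified here.

```python
# Hypothetical sketch: parallel tempering with a policy-gradient temperature update.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Bimodal 1D target: mixture of two well-separated unit Gaussians.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

n_chains, n_iters, lr = 4, 5000, 1e-3
log_spacings = np.zeros(n_chains - 1)   # policy mean: log of ladder spacings (assumption)
policy_std = 0.1                        # fixed exploration noise of the Gaussian policy
x = rng.normal(size=n_chains)           # one sample per chain

for t in range(n_iters):
    # Sample a temperature ladder from the Gaussian policy over log-spacings.
    eps = rng.normal(size=n_chains - 1) * policy_std
    betas = np.concatenate([[1.0], np.exp(-np.cumsum(np.exp(log_spacings + eps)))])

    # Within-chain random-walk Metropolis at each inverse temperature beta.
    prop = x + rng.normal(scale=1.0, size=n_chains)
    accept = np.log(rng.uniform(size=n_chains)) < betas * (log_target(prop) - log_target(x))
    x = np.where(accept, prop, x)

    # Attempt swaps between adjacent chains; use swap acceptance probability as reward.
    rewards = np.zeros(n_chains - 1)
    for i in range(n_chains - 1):
        log_a = (betas[i] - betas[i + 1]) * (log_target(x[i + 1]) - log_target(x[i]))
        rewards[i] = np.exp(min(0.0, log_a))
        if np.log(rng.uniform()) < log_a:
            x[i], x[i + 1] = x[i + 1], x[i]

    # REINFORCE: score function of the Gaussian policy times (reward - baseline).
    advantage = rewards - rewards.mean()
    log_spacings += lr * advantage * eps / policy_std ** 2

print("learned inverse temperatures:",
      np.concatenate([[1.0], np.exp(-np.cumsum(np.exp(log_spacings)))]))
```

In this toy version the reward is the local swap acceptance between adjacent chains; an actual implementation might instead target integrated autocorrelation time or round-trip rate, as the abstract's evaluation metric suggests.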
Submission Number: 134