Adaptive Strategy for Resetting a Non-stationary Markov Chain during Learning via Joint Stochastic Approximation
Keywords: Model Learning, Deep Generative Model, Monte-Carlo Approximation, Markov Chain, Joint Stochastic Optimization
Abstract: In this paper, we tackle the non-stationary kernel problem of the JSA algorithm by Ou and Song 2020, a recent proposal that learns a deep generative model $p_\theta(\mathbf{x},\mathbf{h})$ and a corresponding approximate posterior $q_\phi(\mathbf{h}|\mathbf{x})$ by drawing samples from a non-stationary Markov chain and estimating gradients with these samples. The non-stationary kernel problem refers to the degraded performance of the algorithm due to the constant change of the transition kernel of the chain throughout the run of the algorithm. We present an automatic adaptive strategy for checking whether this change is significant at each gradient-update step or not, and resetting the chain with a sample drawn from the current approximate posterior $q_\phi(\mathbf{h}|\mathbf{x})$ if the answer to the check is yes. In our experiments with the binarized MNIST, our strategy gives results comparable with or slightly better than those reported in the original paper on JSA, while avoiding the nontrivial manual intervention required for handling the non-stationary kernel problem in the original JSA algorithm.
1 Reply
Loading