Learning proposals for sequential importance samplers using reinforced variational inference

Published: 10 Apr 2019, Last Modified: 05 May 2023
Venue: drlStructPred 2019
Readers: Everyone
Keywords: variational inference, reinforcement learning, monte carlo methods, stochastic processes
Abstract: Inferring unobserved values in a partially observed trajectory of a stochastic process can be viewed as a structured prediction problem. Traditionally, inference is conducted using heuristic-based Monte Carlo methods. This work considers learning such heuristics by leveraging a connection between policy optimization in reinforcement learning and approximate inference. In particular, we learn the proposal distributions used in importance samplers by casting proposal learning as a variational inference problem. We then rewrite the variational lower bound as a policy optimization problem, similar to Weber et al. (2015), which allows us to transfer techniques from reinforcement learning. We apply this technique to a simple stochastic process as a proof of concept and show that, while the approach is viable, it will require more engineering effort to scale inference to rare observations.
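
To make the connection in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy model chosen purely for illustration: a Gaussian random walk of `T` unit-variance steps starting at 0, whose endpoint is observed as `y` with noise `sigma`. The proposal is assumed to be the same walk with a single learned drift `theta`. The function names `rollout`, `train`, and `ess_fraction` are hypothetical. Each trajectory's log importance weight, log p(x, y) - log q(x), is treated as the return of a policy, and `theta` is updated with a score-function (REINFORCE) gradient of the variational lower bound, using the batch mean as a baseline.

```python
import math
import random


def rollout(theta, T, y, sigma, rng):
    """Sample one trajectory from the assumed proposal q_theta.

    Returns (log_w, score): the log importance weight log p(x, y) - log q(x)
    and the score d/dtheta log q(x).
    """
    x, log_w, score = 0.0, 0.0, 0.0
    for _ in range(T):
        eps = rng.gauss(0.0, 1.0)
        step = theta + eps  # proposal increment ~ N(theta, 1)
        x += step
        # Per-step "reward": log p(step) - log q(step); the prior increment is N(0, 1).
        log_w += -0.5 * step ** 2 + 0.5 * eps ** 2
        # d/dtheta log N(step; theta, 1) = step - theta = eps
        score += eps
    # Terminal "reward": Gaussian log-likelihood of the observed endpoint.
    log_w += -0.5 * ((y - x) / sigma) ** 2 - 0.5 * math.log(2 * math.pi * sigma ** 2)
    return log_w, score


def train(T=5, y=5.0, sigma=1.0, iters=200, batch=256, lr=0.01, seed=0):
    """Stochastic gradient ascent on the lower bound E_q[log w] with REINFORCE."""
    rng = random.Random(seed)
    theta = 0.0  # start from the prior dynamics (no drift)
    for _ in range(iters):
        samples = [rollout(theta, T, y, sigma, rng) for _ in range(batch)]
        baseline = sum(lw for lw, _ in samples) / batch  # variance reduction
        grad = sum((lw - baseline) * s for lw, s in samples) / batch
        theta += lr * grad
    return theta


def ess_fraction(log_ws):
    """Effective sample size of the normalized weights, as a fraction of N."""
    m = max(log_ws)
    ws = [math.exp(lw - m) for lw in log_ws]
    return sum(ws) ** 2 / (len(ws) * sum(w * w for w in ws))


if __name__ == "__main__":
    theta = train()
    print("learned drift:", theta)
```

For this particular toy model the lower bound is quadratic in `theta` and is maximized at `y / (sigma**2 + T)`, which gives a simple check on the learned value. Comparing `ess_fraction` for trajectories drawn with `theta = 0.0` against trajectories drawn with the learned drift indicates how much the learned proposal improves the importance sampler in this setting.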