Learning proposals for sequential importance samplers using reinforced variational inference

Zafarali Ahmed, Arjun Karuvally, Doina Precup, Simon Gravel

Mar 16, 2019 · ICLR 2019 Workshop drlStructPred · Blind Submission
  • Keywords: variational inference, reinforcement learning, monte carlo methods, stochastic processes
  • Abstract: The problem of inferring unobserved values in a partially observed trajectory from a stochastic process can be considered a structured prediction problem. Traditionally, inference is conducted using heuristic-based Monte Carlo methods. This work instead learns these heuristics by leveraging a connection between policy-optimization reinforcement learning and approximate inference. In particular, we learn the proposal distributions used in importance samplers by casting their estimation as a variational inference problem. We then rewrite the variational lower bound as a policy optimization problem, similar to Weber et al. (2015), allowing us to transfer techniques from reinforcement learning. We apply this technique to a simple stochastic process as a proof of concept and show that, while the approach is viable, it will require more engineering effort to scale inference to rare observations.
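The core idea in the abstract, maximizing a variational lower bound over a proposal distribution with a score-function (REINFORCE-style) policy gradient, can be sketched on a toy inference problem. Everything below (the Gaussian model, the observed value, learning rates, sample sizes) is an illustrative assumption for exposition, not the authors' actual process or experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (illustrative, not from the paper): latent x ~ N(0, 1),
# observation y | x ~ N(x, 0.5^2). We observe y and learn a Gaussian
# proposal q(x) = N(mu, sigma^2) by maximizing the ELBO,
#   ELBO = E_q[log p(x, y) - log q(x)],
# using the score-function gradient  E_q[(log p - log q) * grad log q],
# i.e. the proposal plays the role of a policy and the ELBO integrand
# plays the role of a reward.
y_obs = 2.0
obs_std = 0.5

def log_joint(x):
    # log p(x) + log p(y | x), dropping additive constants
    return -0.5 * x**2 - 0.5 * ((y_obs - x) / obs_std) ** 2

mu, log_sigma = 0.0, 0.0  # proposal parameters
lr = 0.05
for step in range(2000):
    sigma = np.exp(log_sigma)
    x = mu + sigma * rng.standard_normal(64)            # sample from proposal
    log_q = -0.5 * ((x - mu) / sigma) ** 2 - log_sigma  # up to a constant
    reward = log_joint(x) - log_q                       # per-sample ELBO term
    adv = reward - reward.mean()                        # baseline for variance
    # Score-function gradients of log q w.r.t. (mu, log_sigma):
    g_mu = ((x - mu) / sigma**2 * adv).mean()
    g_ls = (((x - mu) ** 2 / sigma**2 - 1.0) * adv).mean()
    mu += lr * g_mu
    log_sigma += lr * g_ls

# For this conjugate Gaussian model the exact posterior is
# N(1.6, 0.2), so the learned proposal should land near
# mu ~ 1.6 and sigma ~ 0.45, making it an efficient
# importance-sampling proposal for this observation.
print(mu, np.exp(log_sigma))
```

The learned proposal can then be plugged into a standard importance sampler; the gap between it and the true posterior shows up as weight variance, which is exactly the scaling difficulty the abstract notes for rare observations.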