REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

ICLR 2017 workshop submission (28 Mar 2024, modified: 17 Mar 2017)
Readers: Everyone
Abstract: Learning in models with discrete latent variables is challenging due to high variance gradient estimators. Generally, approaches have relied on control variates to reduce the variance of the REINFORCE estimator. Recent work (Jang et al. 2016, Maddison et al. 2016) has taken a different approach, introducing a continuous relaxation of discrete variables to produce low-variance, but biased, gradient estimates. In this work, we combine the two approaches through a novel control variate that produces low-variance, unbiased gradient estimates. We present encouraging preliminary results on a toy problem and on learning sigmoid belief networks.
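To make the idea concrete, here is a minimal single-sample sketch of a REBAR-style estimator for one Bernoulli latent variable on a toy quadratic loss f(b) = (b - t)^2, written in JAX. It is an illustration under assumptions, not the authors' implementation: the names theta, eta, lam, and t, the specific hyperparameter values, and the toy objective are all chosen here for exposition.

```python
import jax
import jax.numpy as jnp

t = 0.45     # toy target in f(b) = (b - t)**2 (illustrative value)
eta = 1.0    # control-variate scale (tunable in general)
lam = 0.5    # relaxation temperature (illustrative value)

def f(x):
    # Objective; accepts both the hard sample b and a relaxed sample in (0, 1).
    return (x - t) ** 2

def sample_z(theta, u):
    # Reparameterized logistic sample; b = 1[z > 0] is Bernoulli(sigmoid(theta)).
    return theta + jnp.log(u) - jnp.log1p(-u)

def sample_z_tilde(theta, v, b):
    # Reparameterized sample of z conditioned on the hard value b, using v ~ U(0, 1).
    p = jax.nn.sigmoid(theta)
    u_cond = jnp.where(b == 1.0, (1.0 - p) + v * p, v * (1.0 - p))
    return theta + jnp.log(u_cond) - jnp.log1p(-u_cond)

def rebar_grad(theta, u, v):
    z = sample_z(theta, u)
    b = (z > 0.0).astype(jnp.float32)  # hard sample, held fixed below
    # Score-function (REINFORCE) term, with the relaxed objective evaluated at
    # the conditional sample z_tilde as a control variate.
    log_p = lambda th: (b * jnp.log(jax.nn.sigmoid(th))
                        + (1.0 - b) * jnp.log(1.0 - jax.nn.sigmoid(th)))
    f_tilde = f(jax.nn.sigmoid(sample_z_tilde(theta, v, b) / lam))
    score_term = (f(b) - eta * f_tilde) * jax.grad(log_p)(theta)
    # Reparameterized terms that add the control variate's expectation back,
    # keeping the overall estimator unbiased.
    grad_relaxed = jax.grad(lambda th: f(jax.nn.sigmoid(sample_z(th, u) / lam)))(theta)
    grad_relaxed_tilde = jax.grad(
        lambda th: f(jax.nn.sigmoid(sample_z_tilde(th, v, b) / lam)))(theta)
    return score_term + eta * (grad_relaxed - grad_relaxed_tilde)

key = jax.random.PRNGKey(0)
u_key, v_key = jax.random.split(key)
theta = 0.0
g = rebar_grad(theta, jax.random.uniform(u_key), jax.random.uniform(v_key))
print(g)  # single-sample gradient estimate of E[f(b)] w.r.t. theta
```

The structure mirrors the combination described in the abstract: the relaxed (Concrete-style) objective serves as a control variate for the REINFORCE term, and the two reparameterized gradient terms compensate for it, so the variance reduction comes without introducing bias.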
TL;DR: Combining REINFORCE with the Concrete relaxation to get low variance, unbiased gradient estimates.
Keywords: Unsupervised Learning, Reinforcement Learning, Optimization
Conflicts: google.com, stats.ox.ac.uk