REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

Oct 17, 2021 (edited Mar 17, 2017) · ICLR 2017 workshop submission
  • TL;DR: Combining REINFORCE with the Concrete relaxation to get low variance, unbiased gradient estimates.
  • Abstract: Learning in models with discrete latent variables is challenging due to high-variance gradient estimators. Generally, approaches have relied on control variates to reduce the variance of the REINFORCE estimator. Recent work (Jang et al. 2016, Maddison et al. 2016) has taken a different approach, introducing a continuous relaxation of discrete variables to produce low-variance, but biased, gradient estimates. In this work, we combine the two approaches through a novel control variate that produces low-variance, unbiased gradient estimates. We present encouraging preliminary results on a toy problem and on learning sigmoid belief networks.
  • Keywords: Unsupervised Learning, Reinforcement Learning, Optimization
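The control-variate idea the abstract builds on can be illustrated on a toy Bernoulli problem. The sketch below is not the REBAR estimator (which uses the Concrete relaxation as the control variate); it uses a plain constant baseline, and the objective `f`, the parameter `theta`, and the sample sizes are invented for illustration. Since a constant baseline `c` satisfies `E[c * score] = 0`, subtracting `c * score` leaves the estimator unbiased while shrinking its variance:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.6                       # Bernoulli parameter we differentiate w.r.t.
f = lambda b: (b - 0.45) ** 2     # toy objective; true grad of E[f(b)] is 0.1

N = 200_000
b = (rng.random(N) < theta).astype(float)        # b ~ Bernoulli(theta)
score = b / theta - (1.0 - b) / (1.0 - theta)    # d/dtheta log p(b; theta)

# Plain REINFORCE estimator: f(b) * score. Unbiased but high variance.
plain = f(b) * score

# Constant baseline c estimated from an independent batch, so the
# estimator (f(b) - c) * score stays exactly unbiased.
b2 = (rng.random(N) < theta).astype(float)
c = f(b2).mean()
with_cv = (f(b) - c) * score

print(plain.mean(), with_cv.mean())   # both concentrate near the true 0.1
print(plain.var(), with_cv.var())     # baseline version has far lower variance
```

REBAR's contribution, per the abstract, is a more powerful choice of control variate: the low-variance but biased Concrete-relaxation gradient, whose bias is then corrected so the combined estimator remains unbiased.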