Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
George Tucker, Andriy Mnih, Chris J. Maddison, Jascha Sohl-Dickstein
Feb 11, 2017 (modified: Mar 17, 2017)ICLR 2017 workshop submissionreaders: everyone
Abstract:Learning in models with discrete latent variables is challenging due to high variance gradient estimators. Generally, approaches have relied on control variates to reduce the variance of the REINFORCE estimator. Recent work (Jang et al. 2016, Maddison et al. 2016) has taken a different approach, introducing a continuous relaxation of discrete variables to produce low-variance, but biased, gradient estimates. In this work, we combine the two approaches through a novel control variate that produces low-variance, unbiased gradient estimates. We present encouraging preliminary results on a toy problem and on learning sigmoid belief networks.
TL;DR:Combining REINFORCE with the Concrete relaxation to get low variance, unbiased gradient estimates.