Improved Gradient Estimators for Stochastic Discrete Variables

Evgeny Andriyash; Arash Vahdat; Bill Macready

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: In many applications we seek to optimize an expectation with respect to a distribution over discrete variables. Estimating gradients of such objectives with respect to the distribution parameters is a challenging problem. We analyze existing solutions, including finite-difference (FD) estimators and continuous relaxation (CR) estimators, in terms of bias and variance. We show that the commonly used Gumbel-Softmax estimator is biased and propose a simple method to reduce its bias. We also derive a simpler piecewise-linear continuous relaxation that likewise has reduced bias. We demonstrate empirically that reduced bias leads to better performance in variational inference and on binary optimization tasks.

Keywords: continuous relaxation, discrete stochastic variables, reparameterization trick, variational inference, discrete optimization, stochastic gradient estimation

TL;DR: We propose simple ways to reduce bias and complexity of stochastic gradient estimators used for learning distributions over discrete variables.
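
The abstract's claim that the Gumbel-Softmax estimator is biased can be illustrated on a single Bernoulli variable. The sketch below is not the authors' code or their proposed correction: it assumes a hypothetical toy objective f(z) = (z - 0.45)^2, uses the binary form of the Gumbel-Softmax relaxation (a sigmoid of the logit plus logistic noise, scaled by a temperature `tau`), and compares the Monte Carlo average of the relaxed gradient with the exact gradient of E[f(z)] with respect to p. All function names are illustrative.

```python
import numpy as np

def f(z):
    # Hypothetical toy objective (an assumption, not taken from the paper).
    return (z - 0.45) ** 2

def f_prime(z):
    # Derivative of the toy objective with respect to z.
    return 2.0 * (z - 0.45)

def exact_gradient(p):
    # E[f(z)] = p * f(1) + (1 - p) * f(0) for z ~ Bernoulli(p),
    # so the gradient with respect to p is f(1) - f(0), independent of p.
    return f(1.0) - f(0.0)

def relaxed_gradient(p, tau, n_samples=200_000, seed=0):
    # Monte Carlo average of the reparameterized gradient through the
    # binary Gumbel-Softmax relaxation:
    #   zeta = sigmoid((logit(p) + logit(u)) / tau),  u ~ Uniform(0, 1).
    rng = np.random.default_rng(seed)
    u = rng.uniform(1e-12, 1.0 - 1e-12, size=n_samples)
    logit_p = np.log(p) - np.log1p(-p)
    logistic_noise = np.log(u) - np.log1p(-u)
    zeta = 1.0 / (1.0 + np.exp(-(logit_p + logistic_noise) / tau))
    # Chain rule: d zeta / d p = zeta * (1 - zeta) / (tau * p * (1 - p)).
    dzeta_dp = zeta * (1.0 - zeta) / (tau * p * (1.0 - p))
    return float(np.mean(f_prime(zeta) * dzeta_dp))

if __name__ == "__main__":
    p = 0.5
    print("exact gradient:  ", exact_gradient(p))
    for tau in (1.0, 0.5, 0.1):
        print(f"relaxed, tau={tau}:", relaxed_gradient(p, tau))
```

For this toy objective at p = 0.5 and tau = 1, the relaxed sample reduces to zeta = u, and the expected relaxed gradient works out analytically to 1/15, noticeably below the exact value of 0.1. Lowering `tau` shrinks this gap in this example, typically at the cost of higher variance.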
