CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

Alek Dimitriev; Mingyuan Zhou

Published: 09 Nov 2021, Last Modified: 26 May 2025

NeurIPS 2021 Poster

Readers: Everyone

Keywords: Discrete latent variables, gradient estimation, antithetic, unbiased, low variance, probabilistic modeling, variational autoencoder, copula

TL;DR: A novel gradient estimator for categorical variables based on antithetic sampling using copulas

Abstract: Accurately backpropagating the gradient through categorical variables is a challenging task that arises in various domains, such as training discrete latent variable models. To this end, we propose CARMS, an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples. CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling. It generalizes both the ARMS antithetic estimator for binary variables, which is CARMS for two categories, and LOORF/VarGrad, the leave-one-out REINFORCE estimator, which is CARMS with independent samples. We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as on a structured output prediction task, and find that it outperforms competing methods, including a strong self-control baseline. The code is publicly available.
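As a point of reference for the abstract's remark that CARMS with independent samples reduces to LOORF/VarGrad, the following is a minimal sketch of the standard leave-one-out REINFORCE estimator for a single categorical variable parameterized by logits. It is not the authors' implementation and omits the copula-based antithetic sampling and importance weighting that define CARMS. The function names `loorf_gradient` and `exact_gradient`, the objective `f`, and the NumPy setup are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = np.exp(logits - np.max(logits))
    return shifted / shifted.sum()


def loorf_gradient(logits, f, n_samples, rng):
    """Leave-one-out REINFORCE estimate of d/dlogits E_{z ~ Cat(softmax(logits))}[f(z)].

    Uses n_samples independent draws (hypothetical helper, illustration only).
    """
    if n_samples < 2:
        raise ValueError("leave-one-out baseline needs at least 2 samples")
    logits = np.asarray(logits, dtype=float)
    probs = softmax(logits)
    num_categories = len(probs)

    # Independent categorical samples z_1, ..., z_n.
    z = rng.choice(num_categories, size=n_samples, p=probs)
    f_vals = np.array([f(k) for k in z], dtype=float)

    # Subtracting the leave-one-out mean from f(z_i) and averaging over i
    # is equivalent to centering by the full mean and dividing by (n - 1).
    centered = f_vals - f_vals.mean()

    # Score function: grad of log p(z_i) w.r.t. logits is onehot(z_i) - probs.
    score = np.eye(num_categories)[z] - probs

    return centered @ score / (n_samples - 1)


def exact_gradient(logits, f):
    """Closed-form gradient, p_k * (f(k) - E[f]), for comparison on small problems."""
    probs = softmax(np.asarray(logits, dtype=float))
    f_all = np.array([f(k) for k in range(len(probs))], dtype=float)
    return probs * (f_all - probs @ f_all)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = np.array([0.5, -1.0, 0.2, 0.0])
    f = lambda k: (k - 1.5) ** 2  # hypothetical objective
    estimates = np.stack([loorf_gradient(logits, f, 4, rng) for _ in range(20000)])
    print("mean of estimates:", estimates.mean(axis=0))
    print("exact gradient:   ", exact_gradient(logits, f))
```

Because the estimator is unbiased, the average of many estimates should approach the closed-form gradient; the variance of individual estimates is what antithetic schemes such as CARMS aim to reduce.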


Supplementary Material: pdf

Code: https://github.com/alekdimi/carms

Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/carms-categorical-antithetic-reinforce-multi/code)
