Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo; Dominic Coey; Mikael Konutgan; Wenting Li; Chris Schoener; Matt Goldman

Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener, Matt Goldman

Published: 09 Nov 2021, Last Modified: 05 May 2023NeurIPS 2021 PosterReaders: Everyone

Keywords: experimentation, variance reduction, agnostic statistics, debiased machine learning, semiparametrics, experiment splitting

Abstract: We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over $70\%$ lower variance than the simple difference-in-means estimator, and about $19\%$ lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

TL;DR: We show how to use supervised ML methods to substantially increase precision in experimental causal inference.

Supplementary Material: pdf

11 Replies

Loading