On Efficient Estimation of Distributional Treatment Effects under Covariate-Adaptive Randomization

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
TL;DR: Use ML to fold extra covariates into balanced trials and reveal the full outcome distribution with optimal precision—beyond averages.
Abstract: This paper focuses on the estimation of distributional treatment effects in randomized experiments that use covariate-adaptive randomization (CAR). These include designs such as Efron's biased-coin design and stratified block randomization, where participants are first grouped into strata based on baseline covariates and assigned treatments within each stratum to ensure balance across groups. In practice, datasets often contain additional covariates beyond the strata indicators. We propose a flexible distribution regression framework that leverages off-the-shelf machine learning methods to incorporate these additional covariates, enhancing the precision of distributional treatment effect estimates. We establish the asymptotic distribution of the proposed estimator and introduce a valid inference procedure. Furthermore, we derive the semiparametric efficiency bound for distributional treatment effects under CAR and demonstrate that our regression-adjusted estimator attains this bound. Simulation studies and analyses of real-world datasets highlight the practical advantages of our method.
Lay Summary: When researchers evaluate new interventions—whether a medical treatment, education policy, or product feature—they often randomize participants so treatment and control groups look alike on a few key traits such as age or gender. Yet the data they collect record many other details, and those are usually ignored. We show how to feed these extra details into off-the-shelf machine-learning tools to track how the full range of outcomes—winners, losers, and everyone in between—shifts under the interventions, not just the average effect. Our mathematical results show the method squeezes every drop of information the data can offer, so no rival approach can do better. Sharper, distribution-wide evidence helps policymakers target resources where they matter most instead of relying on one-size-fits-all averages drawn from traditional randomized trials.
Link To Code: https://github.com/CyberAgentAILab/dte_car
Primary Area: General Machine Learning->Causality
Keywords: treatment effects, field experiments, covariate-adaptive randomization, regression adjustment, causal inference, semiparametric estimation
Submission Number: 6186
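
As an illustration of the regression adjustment described in the abstract, the following is a minimal sketch, not the authors' implementation (see the linked repository for that). It assumes an augmented inverse-probability-weighted form of the adjusted CDF estimate, uses the empirical assignment share within each stratum, and swaps in a least-squares linear probability model where the paper would use an off-the-shelf machine learning learner; cross-fitting is omitted for brevity. The function names `simulate` and `adjusted_cdf` and the simulated data are hypothetical.

```python
import numpy as np


def simulate(n=2000, seed=0):
    """Hypothetical data: 4 strata, 2 extra covariates, balanced assignment per stratum."""
    rng = np.random.default_rng(seed)
    strata = rng.integers(0, 4, size=n)
    x = rng.normal(size=(n, 2))
    w = np.zeros(n, dtype=int)
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        rng.shuffle(idx)
        w[idx[: len(idx) // 2]] = 1  # treat half of each stratum
    # Assumed outcome model: treatment shifts the outcome by +1
    y = 0.5 * strata + x[:, 0] + 1.0 * w + rng.normal(size=n)
    return y, w, x, strata


def adjusted_cdf(y_obs, w, x, strata, arm, grid):
    """Regression-adjusted estimate of P(Y(arm) <= y) for each y in grid."""
    n = len(y_obs)
    in_arm = w == arm
    # Empirical share of units assigned to `arm` within each stratum
    pi = np.empty(n)
    for s in np.unique(strata):
        mask = strata == s
        pi[mask] = in_arm[mask].mean()
    design = np.column_stack([np.ones(n), x])
    est = np.empty(len(grid))
    for j, y in enumerate(grid):
        ind = (y_obs <= y).astype(float)
        # Stand-in learner: linear probability model fitted on units in `arm`
        beta, *_ = np.linalg.lstsq(design[in_arm], ind[in_arm], rcond=None)
        mu = np.clip(design @ beta, 0.0, 1.0)
        # Weighted residual correction plus the model prediction for all units
        est[j] = np.mean(in_arm * (ind - mu) / pi + mu)
    return est


if __name__ == "__main__":
    y, w, x, strata = simulate()
    grid = np.quantile(y, [0.25, 0.5, 0.75])
    # Distributional treatment effect: difference in CDFs at each threshold
    dte = adjusted_cdf(y, w, x, strata, 1, grid) - adjusted_cdf(y, w, x, strata, 0, grid)
    print(dict(zip(np.round(grid, 2), np.round(dte, 3))))
```

Because the treatment in this simulation shifts outcomes upward, the treated CDF lies below the control CDF, so the estimated differences should be negative at every threshold. Setting the prediction `mu` to zero recovers the unadjusted within-stratum weighted estimate, which makes it easy to compare the two on the same data.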