TL;DR: We propose an integer program that computes the worst-case copula used in the causal bootstrap, improving the estimation of design uncertainty in experimental causal inference and generalizing the settings in which this bootstrap applies.
Abstract: In experimental causal inference, we distinguish between two sources of uncertainty: design uncertainty, due to the treatment assignment mechanism, and sampling uncertainty, which arises when the sample is drawn from a super-population. The distinction matters in settings with small fixed samples and heterogeneous treatment effects, as in geographical experiments. The standard bootstrap procedure most often used by practitioners primarily estimates sampling uncertainty, while the causal bootstrap procedure, which accounts for design uncertainty, was developed only for the completely randomized design and the difference-in-means estimator, whereas non-standard designs and estimators are common in these low-power regimes. We address this gap by proposing an integer program that numerically computes the worst-case copula used as an input to the causal bootstrap method, in a wide range of settings. Specifically, we prove the asymptotic validity of our approach for unconfounded, conditionally unconfounded, and individualistic assignments with bounded confoundedness, and generalize to arbitrary linear-in-treatment and quadratic-in-treatment estimators. We demonstrate the refined confidence intervals achieved through simulations of small geographical experiments.
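To make the method concrete, here is a minimal sketch of the one special case where the worst-case copula is known in closed form: for the difference-in-means estimator under complete randomization, the least-favorable coupling of the potential outcomes is the one maximizing their covariance, which can be posed as an assignment problem (an integer program whose linear-programming relaxation is exact). The data, variable names, and equal arm sizes below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative data: outcomes from the treated and control arms of a
# small completely randomized experiment (equal arm sizes for simplicity;
# with unequal arms one would couple empirical quantiles instead).
rng = np.random.default_rng(0)
y1 = rng.normal(1.0, 1.0, size=8)   # treated-arm outcomes Y(1)
y0 = rng.normal(0.0, 1.5, size=8)   # control-arm outcomes Y(0)

# The design variance of the difference in means is
#   S1^2/n1 + S0^2/n0 - S_tau^2/n,
# so the worst case minimizes the effect variance S_tau^2, i.e. the
# coupling pi maximizes sum_i y1[i] * y0[pi(i)].  Posed as an assignment
# problem, this is an integer program whose LP relaxation is exact.
reward = np.outer(y1, y0)           # reward[i, j] = y1[i] * y0[j]
rows, cols = linear_sum_assignment(reward, maximize=True)

# The optimal permutation is the empirical worst-case copula.  By the
# rearrangement inequality it recovers the comonotone (rank-matched)
# coupling, agreeing with the known closed form for this special case.
assert (np.argsort(np.argsort(y1)) == np.argsort(np.argsort(y0[cols]))).all()

tau_imputed = y1 - y0[cols]         # imputed unit-level effects
print("worst-case S_tau^2:", tau_imputed.var(ddof=1))
```

The paper's contribution is precisely the settings this sketch does not cover: the integer program computes the worst-case copula numerically when no such closed form is available, e.g. under other assignment mechanisms and for general linear- or quadratic-in-treatment estimators.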
Lay Summary: When experimentally testing new ideas on small samples, such as a new policy in a few geographical regions, it can be hard to tell whether observed effects are due to how the policy was assigned to each region or simply due to chance in how the population was formed. Standard statistical methods often fail to fully capture the former, especially in complex experiments with varied individual responses. We introduce a method based on mathematical optimization (integer programming) that finds the most challenging, yet plausible, way the potential outcomes (both observed and unobserved) could be linked. This "worst-case" linkage is then used to more accurately bound the uncertainty arising from the experimental design itself. Our approach is flexible, working across a wider range of experimental setups and effect measurements than previous techniques. It offers more trustworthy, and often more precise, estimates of uncertainty (confidence intervals), particularly in situations with limited data, leading to better-informed decisions based on experimental evidence.
Primary Area: General Machine Learning->Causality
Keywords: design uncertainty, causal bootstrap, integer programming
Submission Number: 7698