Abstract: Estimating how a treatment affects individual units, known as heterogeneous treatment effect (HTE) estimation, is an essential part
of decision-making and policy implementation. The accumulation
of large amounts of data in many domains, such as healthcare and
e-commerce, has led to increased interest in developing data-driven
algorithms for estimating heterogeneous effects from observational
and experimental data. However, these methods often make strong
assumptions about the observed features and ignore the underlying
causal model structure, which can lead to biased HTE estimation.
At the same time, accounting for the causal structure of real-world
data is rarely trivial since the causal mechanisms that gave rise
to the data are typically unknown. To address this problem, we
develop a feature selection method that considers each feature’s
value for HTE estimation and learns the relevant parts of the causal
structure from data. We provide strong empirical evidence that
our method improves existing data-driven HTE estimation methods under arbitrary underlying causal structures. Our results on
synthetic, semi-synthetic, and real-world datasets show that our
feature selection algorithm leads to lower HTE estimation error.