Visible to everyone since 13 Oct 2023
Collider bias, which arises from non-random sample selection driven by both the treatment and the outcome, is a significant and challenging problem in treatment effect estimation. Previous studies show that treatment effects are identifiable if shadow variables are available in the observational data. A shadow variable is assumed to be a fully observed covariate that is independent of the sample selection mechanism after conditioning on the outcome and the other observed covariates. In real-world scenarios, however, finding a well-defined shadow variable is often no easier than dealing with collider bias itself. We therefore propose ShadowCatcher, a novel method that automatically generates representations serving the role of shadow variables from the observed covariates. Specifically, during the generation process, we impose conditional independence constraints on the learned representations so that they satisfy the shadow variable assumptions. To further ensure that the generated representations are valid, we use a tester to perform hypothesis testing and repeat the generation process until the generated representations pass the test. Using the generated representations, we propose ShadowEstimator, a novel estimator of treatment effects under collider bias. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of the proposed ShadowCatcher and ShadowEstimator.
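To make the notion of collider bias concrete, the following is a minimal toy simulation, not the ShadowCatcher or ShadowEstimator method. It assumes a randomized binary treatment, a linear outcome, and a logistic selection mechanism that depends on both treatment and outcome. The function name `simulate_collider_bias`, the coefficients, and the sample size are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def simulate_collider_bias(n=200_000, true_effect=2.0, seed=0):
    """Toy illustration: selection on treatment and outcome biases a naive estimate.

    All coefficients are arbitrary assumptions chosen for illustration.
    Returns (estimate on the full population, estimate on the selected sample).
    """
    rng = np.random.default_rng(seed)

    # Randomized binary treatment, so the full-population contrast is unbiased.
    t = rng.integers(0, 2, size=n)
    # Outcome: linear treatment effect plus standard normal noise.
    y = true_effect * t + rng.normal(size=n)

    # Selection indicator S depends on BOTH treatment and outcome,
    # which makes S a collider between T and Y.
    logits = 1.5 * t + 1.5 * y - 2.0
    p_select = 1.0 / (1.0 + np.exp(-logits))
    s = rng.random(n) < p_select

    def diff_in_means(t_arr, y_arr):
        # Naive estimator: mean outcome of treated minus mean outcome of controls.
        return y_arr[t_arr == 1].mean() - y_arr[t_arr == 0].mean()

    full_estimate = diff_in_means(t, y)
    selected_estimate = diff_in_means(t[s], y[s])
    return full_estimate, selected_estimate


if __name__ == "__main__":
    full, selected = simulate_collider_bias()
    print(f"true effect:              {2.0:.3f}")
    print(f"full-population estimate: {full:.3f}")
    print(f"selected-sample estimate: {selected:.3f}")
```

In this sketch the full-population contrast recovers the true effect up to sampling noise, while the same contrast computed only on selected units does not, even though treatment was randomized. Conditioning on selection is what induces the bias, and this is the setting in which a shadow variable is assumed to restore identifiability.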