Conceptualizing Treatment Leakage in Text-based Causal Inference

Anonymous

16 Jan 2022 (modified: 05 May 2023) | ACL ARR 2022 January Blind Submission | Readers: Everyone
Abstract: Causal inference methods that control for text-based confounders are becoming increasingly important in the social sciences and other disciplines where text is readily available. These methods, however, rely on the critical assumption that there is no treatment leakage: that is, the text contains information only about the confounder and none about treatment assignment. Violating this assumption leads to post-treatment bias. The assumption may be unrealistic in real-world settings involving text, because human language is rich and flexible. We first define the leakage problem and discuss the identification and estimation challenges it raises. We also discuss the conditions under which leakage can be addressed by removing the treatment-related signal from the text in a pre-processing step we define as text distillation. Then, using simulation, we investigate how treatment leakage affects estimates of the average treatment effect (ATE).
Paper Type: short
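
The leakage mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation design. It assumes a linear outcome model, a single scalar "text feature" `w` standing in for a text representation, and leakage that enters additively as `leak * t`. All names (`simulate`, `ols_treatment_coef`, `leak`, `tau`, `beta`) are hypothetical and chosen only for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def ols_treatment_coef(y, t, w):
    """Coefficient on t from an OLS fit of y on (1, t, w), in closed form."""
    s_tt, s_ww, s_tw = cov(t, t), cov(w, w), cov(t, w)
    s_ty, s_wy = cov(t, y), cov(w, y)
    return (s_ww * s_ty - s_tw * s_wy) / (s_tt * s_ww - s_tw ** 2)


def simulate(leak, n=20000, tau=1.0, beta=2.0, seed=0):
    """Estimate the ATE while adjusting for a text feature that may leak treatment.

    Assumed (illustrative) data-generating process:
      u ~ N(0, 1)                  confounder
      t = 1[u + noise > 0]         treatment depends on the confounder
      y = tau * t + beta * u + e   outcome; the true ATE is tau
      w = u + leak * t             text feature; leak = 0 means no leakage
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    t = [1.0 if ui + rng.gauss(0, 1) > 0 else 0.0 for ui in u]
    y = [tau * ti + beta * ui + rng.gauss(0, 1) for ti, ui in zip(t, u)]
    w = [ui + leak * ti for ui, ti in zip(u, t)]
    return ols_treatment_coef(y, t, w)


if __name__ == "__main__":
    for leak in (0.0, 0.25, 0.5):
        print(f"leak={leak:.2f}  estimated ATE={simulate(leak):.3f}")
```

Under these assumptions, substituting `u = w - leak * t` into the outcome equation shows that the adjusted estimate converges to `tau - beta * leak` rather than `tau`. With no leakage the estimate recovers the true ATE, and the bias grows linearly with the amount of treatment signal carried by the text feature.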