Keywords: Selection Bias, Structural Causal Models, Causal Model Abstraction, Causal Graphs
Abstract: Selection bias is ubiquitous in real-world data and can yield misleading results if not appropriately addressed. We introduce a conditioning operation on simple Structural Causal Models (SCMs), a model class more general than acyclic SCMs, to model latent selection from a causal perspective. We show that the conditioning operation transforms an SCM with an explicit latent selection mechanism into an SCM without a selection mechanism that encodes as much as possible of the causal semantics of the selected subpopulation according to the original SCM. Graphically, in Directed Mixed Graphs, we extend the semantics of bidirected edges, which originally represent only latent common causes, to also represent latent selection bias. Furthermore, we show that the conditioning operation preserves the simplicity, acyclicity, and linearity of SCMs, and that it commutes with marginalization and with conditioning itself. Thanks to these properties, the conditioning operation, combined with marginalization and intervention, offers a valuable tool for causal modeling, causal reasoning, and causal model learning in causal models where latent details have been abstracted away. Through illustrative examples, we demonstrate how this abstraction reduces the complexity of these three tasks, highlighting both the theoretical clarity and the practical utility of our approach. We hope that our results deepen the understanding of selection bias from the perspective of SCMs and are integrated into the causal modeling toolbox, ultimately helping modelers develop more reliable and trustworthy causal models.
Submission Number: 1