TL;DR: We develop a novel algorithm to falsify the assumption of no unmeasured confounding using multi-environment observational data.
Abstract: A major challenge in estimating treatment effects in observational studies is the reliance on untestable conditions, such as the assumption of no unmeasured confounding. In this work, we propose an algorithm that can falsify the assumption of no unmeasured confounding given observational data from multiple heterogeneous sources, which we refer to as environments. Our falsification strategy leverages a key observation: unmeasured confounding can cause observed causal mechanisms to appear dependent. Building on this observation, we develop a novel two-stage procedure that detects these dependencies with high statistical power while controlling false positives. The algorithm requires no randomized data and, in contrast to other falsification approaches, remains applicable under transportability violations, where the environment has a direct effect on the outcome of interest. To demonstrate the practical relevance of our approach, we show that our method efficiently detects confounding on both simulated and semi-synthetic data.
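To make the key observation concrete, below is a minimal, self-contained sketch in Python. It is not the paper's two-stage procedure (see the linked repository for that); the data-generating process, function names, and the Spearman-correlation test are all illustrative assumptions. In each simulated environment we fit the treatment mechanism P(T | X) and the outcome mechanism E[Y | T, X], then test whether the fitted mechanisms covary across environments: with no unmeasured confounding they vary independently, while a shared unmeasured confounder makes them appear dependent.

```python
# Hypothetical sketch of the key observation, not the paper's algorithm.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)

def simulate_env(n, confounded, rng):
    """One environment: independent env-level shifts on both mechanisms,
    plus an optional shared unmeasured confounder level u_e."""
    a_e = rng.normal(0, 0.5)                   # env shift in the treatment mechanism
    b_e = rng.normal(0, 0.5)                   # direct env effect on the outcome
    u_e = rng.normal() if confounded else 0.0  # unmeasured confounder level
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(a_e + x + u_e)))
    t = rng.binomial(1, p)
    y = b_e + x + 1.0 * t + u_e + rng.normal(scale=0.5, size=n)
    return x.reshape(-1, 1), t, y

def mechanism_summaries(n_envs, n, confounded, rng):
    """Fit both mechanisms in each environment and return one scalar
    summary per mechanism (here: the fitted intercepts)."""
    t_side, y_side = [], []
    for _ in range(n_envs):
        x, t, y = simulate_env(n, confounded, rng)
        t_model = LogisticRegression().fit(x, t)          # P(T | X)
        y_model = LinearRegression().fit(np.c_[t, x], y)  # E[Y | T, X]
        t_side.append(t_model.intercept_[0])
        y_side.append(y_model.intercept_)
    return np.array(t_side), np.array(y_side)

for confounded in (False, True):
    ts, ys = mechanism_summaries(n_envs=50, n=2000, confounded=confounded, rng=rng)
    rho, pval = spearmanr(ts, ys)  # dependence across environments
    print(f"confounded={confounded}: Spearman rho={rho:+.2f}, p-value={pval:.3f}")
```

In the confounded case both fitted intercepts absorb the environment-level confounder shift u_e, so the cross-environment test flags dependence; the direct environment effect b_e on the outcome, by contrast, does not induce such dependence, which hints at why this kind of check can tolerate transportability violations.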
Lay Summary: Estimating how effective an intervention is—such as giving a drug to a patient or implementing a new public policy—using real-world data is incredibly important, but also challenging. To make these estimates, researchers often assume that all relevant factors influencing both who receives the treatment and what happens afterward have been measured. If these factors, known as confounders, are missing, the estimated effects can be misleading and may result in unsafe or ineffective recommendations. We introduce a new way to assess whether all confounders have been accounted for in settings where data is collected from multiple groups, such as different hospitals, schools, or regions. Our approach builds on a principle from causal reasoning: in a well-functioning system, different parts of the cause-and-effect process should operate independently. If they appear unexpectedly linked, this may indicate the presence of unmeasured confounding. We propose a method that uses this idea to detect such unexpected links and thereby assess whether all relevant confounders have been measured. This allows researchers to evaluate the reliability of their treatment effect estimates and avoid drawing incorrect conclusions from real-world data.
Link To Code: https://github.com/RickardKarl/falsification-unconfoundedness
Primary Area: General Machine Learning->Causality
Keywords: causal inference, observational data, falsification, unmeasured confounding, independent causal mechanisms
Submission Number: 1866