Automated Detection of Causal Inference Opportunities: Regression Discontinuity Subgroup Discovery

Published: 10 Nov 2023, Last Modified: 10 Nov 2023. Accepted by TMLR.
Abstract: The gold standard for the identification of causal effects is the randomized controlled trial (RCT), but RCTs may not always be feasible to conduct. When treatments depend on a threshold, however, such as the blood sugar threshold for diabetes diagnosis, we can still sometimes estimate causal effects with regression discontinuities (RDs). RDs are valid when units just above and below the threshold have the same distribution of covariates and thus no confounding in the presence of noise, establishing an as-if randomization. In practice, however, implementing RD studies can be difficult, as identifying treatment thresholds requires considerable domain expertise. Furthermore, the thresholds may differ across subgroups (e.g., the blood sugar threshold for diabetes may differ across demographics), and ignoring these differences can lower statistical power. Finding the thresholds and to whom they apply is an important problem currently solved manually by domain experts, and data-driven approaches are needed when domain expertise is not sufficient. Here, we introduce Regression Discontinuity SubGroup Discovery (RDSGD), a machine-learning method that identifies statistically powerful and interpretable subgroups for RD thresholds. Using a medical claims dataset with over 60 million patients, we apply RDSGD to multiple clinical contexts and identify subgroups with increased compliance to treatment assignment thresholds. As treatment thresholds matter for many diseases and policy decisions, RDSGD can be a powerful tool for discovering new avenues for causal estimation.
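To make the regression discontinuity idea in the abstract concrete, the following is a minimal sketch of a sharp RD estimate using a local linear fit on synthetic data. It is not the RDSGD method and is not taken from the linked repository; the function name `sharp_rd_estimate`, the bandwidth, the cutoff, and the simulated effect size are all assumptions chosen for illustration. It also assumes perfect compliance with the threshold, which is a simplification of the setting the abstract describes.

```python
import numpy as np

def sharp_rd_estimate(x, y, cutoff, bandwidth):
    """Hypothetical helper: local linear sharp-RD estimate of the jump in E[y | x] at the cutoff."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep only units within the bandwidth around the threshold.
    mask = np.abs(x - cutoff) <= bandwidth
    xc = x[mask] - cutoff
    above = (xc >= 0).astype(float)
    # Columns: intercept, above-threshold indicator, slope, slope change above threshold.
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
    # The coefficient on the indicator is the estimated discontinuity.
    return coef[1]

# Synthetic data (assumed, for illustration only): true jump of 2.0 at x = 0.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1, 1, n)   # running variable
treated = x >= 0.0          # sharp assignment at the threshold
y = 1.0 + 0.5 * x + 2.0 * treated + rng.normal(0, 0.5, n)

estimate = sharp_rd_estimate(x, y, cutoff=0.0, bandwidth=0.5)
print(f"estimated discontinuity: {estimate:.2f}")
```

Fitting separate slopes on each side of the threshold keeps a smooth trend in the running variable from being mistaken for a treatment effect; only the jump at the cutoff is attributed to treatment.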
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Camera-ready copy. Added additional discussion to Section 7 on use cases and power limitations according to the Action Editor's revision requests.
Code: https://github.com/tliu526/rdsgd
Supplementary Material: zip
Assigned Action Editor: ~Novi_Quadrianto1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1258