Quantifying Treatment Effects: Estimating Risk Ratios via Observational Studies

Ahmed BOUGHDIRI; Julie Josse; Erwan Scornet

Published: 01 May 2025, Last Modified: 23 Jul 2025. Venue: ICML 2025 poster. License: CC BY 4.0

Abstract: The Risk Difference (RD), an absolute measure of effect, is widely used and well-studied in both randomized controlled trials (RCTs) and observational studies. Complementary to the RD, the Risk Ratio (RR), as a relative measure, is critical for a comprehensive understanding of intervention effects: RD can downplay small absolute changes, while RR can highlight them. Despite its significance, the theoretical study of RR has received less attention, particularly in observational settings. This paper addresses this gap by tackling the estimation of RR in observational data. We propose several RR estimators and establish their theoretical properties, including asymptotic normality and confidence intervals. Through analyses on simulated and real-world datasets, we evaluate the performance of these estimators in terms of bias, efficiency, and robustness to generative data models. We also examine the coverage and length of the associated confidence intervals. Due to the non-linear nature of RR, influence function theory yields two distinct efficient estimators with different convergence assumptions. Based on theoretical and empirical insights, we recommend, among all estimators, one of the two doubly-robust estimators, which, intriguingly, challenges conventional expectations.
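To make the contrast between the two measures concrete, here is a minimal sketch of how RD and RR are computed from the outcome risk under treatment and under control. The function names and the numeric risks are illustrative assumptions, not values or code from the paper.

```python
def risk_difference(p_treated: float, p_control: float) -> float:
    """Absolute effect: risk under treatment minus risk under control."""
    return p_treated - p_control


def risk_ratio(p_treated: float, p_control: float) -> float:
    """Relative effect: risk under treatment divided by risk under control."""
    if p_control == 0:
        raise ValueError("risk ratio is undefined when the control risk is zero")
    return p_treated / p_control


# Hypothetical rare outcome: the absolute change is tiny, the relative change is large.
rare = (0.002, 0.001)
print(f"RD={risk_difference(*rare):.3f}, RR={risk_ratio(*rare):.2f}")  # RD=0.001, RR=2.00

# Hypothetical common outcome: a larger absolute change, a modest relative change.
common = (0.6, 0.5)
print(f"RD={risk_difference(*common):.3f}, RR={risk_ratio(*common):.2f}")  # RD=0.100, RR=1.20
```

In the first case the RD of 0.001 looks negligible while the RR shows the risk doubling, which is the sense in which the two measures complement each other.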

Lay Summary: Causal inference is a field of research that helps scientists and doctors figure out whether a treatment or intervention actually causes a change in people’s health, rather than just being linked to it by coincidence. One commonly used way to measure the effect of a treatment is the Risk Ratio, which compares how likely an outcome (e.g., getting sick) is between two groups, such as those who received a treatment and those who did not. We examine methods for estimating the Risk Ratio when using data from real-world settings, where treatments are not randomly assigned to people. We also look closely at what happens to these estimates as the number of people in the study gets very large.
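The following sketch illustrates why non-random treatment assignment matters for the Risk Ratio. It uses a generic textbook inverse-probability-weighting (IPW) estimator with a known propensity score, which is not necessarily one of the estimators studied in the paper. The data-generating process, the function names, and all numeric values are hypothetical choices made for this illustration.

```python
import random


def simulate(n: int, seed: int = 0):
    """Simulate (x, t, y, e): binary confounder, treatment, outcome, true propensity."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0      # confounder
        e = 0.8 if x == 1 else 0.2              # P(T=1 | X=x), assumed known here
        t = 1 if rng.random() < e else 0        # treatment depends on x (not randomized)
        p = 0.1 + 0.2 * x + 0.1 * t             # P(Y=1 | T=t, X=x)
        y = 1 if rng.random() < p else 0
        rows.append((x, t, y, e))
    return rows


def naive_rr(rows) -> float:
    """Ratio of observed outcome rates, ignoring the confounder."""
    treated = [y for _, t, y, _ in rows if t == 1]
    control = [y for _, t, y, _ in rows if t == 0]
    return (sum(treated) / len(treated)) / (sum(control) / len(control))


def ipw_rr(rows) -> float:
    """Ratio of inverse-probability-weighted means of the two potential outcomes."""
    n = len(rows)
    mean_treated = sum(t * y / e for _, t, y, e in rows) / n
    mean_control = sum((1 - t) * y / (1 - e) for _, t, y, e in rows) / n
    return mean_treated / mean_control


if __name__ == "__main__":
    data = simulate(200_000)
    # Under this simulation, E[Y(1)] = 0.3 and E[Y(0)] = 0.2, so the true RR is 1.5.
    print(f"naive RR: {naive_rr(data):.2f}")
    print(f"IPW RR:   {ipw_rr(data):.2f}")
```

Because people with x = 1 are both more likely to be treated and more likely to experience the outcome, the naive ratio overstates the effect (its large-sample value under this simulation is 0.36 / 0.14, about 2.57), whereas the weighted ratio should land close to 1.5 for large samples. In practice the propensity score is unknown and has to be estimated, which this sketch does not cover.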

Primary Area: General Machine Learning->Causality

Keywords: Causal inference, Observational data, Risk Ratio

Submission Number: 11176
