Integrative $R$-learner of heterogeneous treatment effects combining experimental and observational studies
Keywords: Causal inference, Double penalization, Empirical risk minimization, Hidden confounding, Series estimator
Abstract: Randomized controlled trials (RCTs), or controlled experimental studies, are the gold standard for estimating heterogeneous treatment effects (HTEs), because treatment randomization mitigates confounding bias. However, experimental datasets are usually small and limited in subject diversity because of their high cost. Large observational studies (OSs), by contrast, are increasingly available, but they may be subject to hidden confounding, whose existence is not testable. We develop an integrative $R$-learner for the HTE and the confounding function that leverages experimental data for identification and observational data for efficiency. We form a regularized loss function for the HTE and the confounding function that satisfies Neyman orthogonality, which allows flexible models for estimating the nuisance functions. The key novelty of the proposed integrative $R$-learner is that it imposes different regularization terms on the HTE and the confounding function, so that any smoothness or sparsity of the confounding function can be exploited to improve HTE estimation. The integrative $R$-learner has two benefits: first, it provides a general framework that accommodates various HTE models for loss minimization; second, without any prior knowledge of hidden confounding in the OS, it is consistent and asymptotically at least as efficient as the estimator that uses only the RCT. Extensive simulations and a real-data application adapted from an educational experiment show that the proposed integrative $R$-learner outperforms alternative approaches.
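As a rough illustration of the double-penalization idea described in the abstract, the sketch below fits a linear HTE and a linear confounding function jointly on pooled RCT and OS data, with a separate ridge penalty on each. It is a toy under stated assumptions, not the paper's estimator: the simulated data, the linear bases, the ridge penalties, the residualized loss (outcome residual regressed on treatment residual times the HTE, plus an extra confounding term for OS units only), and every name in it (`simulate`, `fit_nuisances`, `integrative_fit`, `lam_tau`, `lam_c`) are hypothetical choices made for this example. Nuisances are fit in-sample without cross-fitting to keep the code short.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate(n, is_rct):
    """Toy data: true HTE is tau(x) = 1 + 0.5 x; u is a hidden confounder."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)  # never shown to the learner
    if is_rct:
        a = rng.binomial(1, 0.5, size=n).astype(float)  # randomized treatment
    else:
        a = (u + rng.normal(size=n) > 0).astype(float)  # treatment depends on u
    y = x + a * (1.0 + 0.5 * x) + u + rng.normal(size=n)
    s = np.full(n, int(is_rct))  # s = 1 for RCT, s = 0 for OS
    return x, a, y, s


def fit_nuisances(x, a, y):
    """Assumed simple nuisance models: linear outcome mean, constant propensity."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef, np.full_like(x, a.mean())


def integrative_fit(x, a, y, s, lam_tau=1.0, lam_c=50.0):
    """Jointly fit HTE and confounding-function coefficients with two ridge penalties."""
    y_res = np.empty_like(y)
    a_res = np.empty_like(y)
    for src in (0, 1):  # residualize outcome and treatment within each data source
        m = s == src
        mu_hat, e_hat = fit_nuisances(x[m], a[m], y[m])
        y_res[m] = y[m] - mu_hat
        a_res[m] = a[m] - e_hat
    phi = np.column_stack([np.ones_like(x), x])  # basis shared by both functions
    z_tau = a_res[:, None] * phi                 # HTE columns, all units
    z_c = ((1 - s) * a_res)[:, None] * phi       # confounding columns, OS units only
    z = np.hstack([z_tau, z_c])
    penalty = np.diag([lam_tau] * 2 + [lam_c] * 2)  # different penalty per function
    beta = np.linalg.solve(z.T @ z + penalty, z.T @ y_res)
    return beta[:2], beta[2:]


# Pool a smaller RCT with a larger, confounded OS.
parts = [simulate(2000, True), simulate(20000, False)]
x, a, y, s = (np.concatenate(cols) for cols in zip(*parts))
tau_hat, c_hat = integrative_fit(x, a, y, s)
print("HTE coefficients (intercept, slope):", tau_hat)
print("confounding coefficients (intercept, slope):", c_hat)
```

In this toy setup the confounding function is constant in x, so penalizing its coefficients more heavily than the HTE coefficients is one way to let the OS sharpen the HTE slope while the RCT anchors identification.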
Supplementary Material: zip