A kernel balancing approach that scales to big data

Kwangho Kim; Bijan A Niknam; Jose R Zubizarreta

03 Oct 2022 (modified: 05 May 2023), CML4Impact, Readers: Everyone

Keywords: Causal Inference, Observational Studies, Weighting Methods, Convex Optimization, Propensity Score

TL;DR: We propose a stable kernel balancing approach to weighting that scales to large datasets without sacrificing estimator accuracy

Abstract: In causal inference, weighting is commonly used for covariate adjustment. Procedurally, weighting can be accomplished either through methods that model the propensity score or through methods that use convex optimization to find the weights that balance the covariates directly. However, the computational demand of the balancing approach has to date precluded it from including broad classes of functions of the covariates in large datasets. To address this problem, we outline a scalable approach to balancing that incorporates a kernel representation of a broad class of basis functions. First, we use the Nyström method to rapidly generate a kernel basis in a reproducing kernel Hilbert space containing a broad class of basis functions of the covariates. Then, we integrate these basis functions as constraints in a state-of-the-art implementation of the alternating direction method of multipliers, which rapidly finds the optimal weights that balance the general basis functions in the kernel. Using this kernel balancing approach, we conduct a national observational study of the relationship between hospital profit status and the treatment and outcomes of heart attack care in a large dataset containing 1.27 million patients and over 3,500 hospitals. After weighting, we observe that for-profit hospitals perform percutaneous coronary intervention at rates similar to those of other hospitals; however, their patients have slightly higher mortality and higher readmission rates.
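
The two steps named in the abstract (a Nyström kernel basis, then ADMM to find balancing weights) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the Gaussian kernel, the bandwidth `gamma`, the random choice of landmarks, the exact-balance constraints, and the hand-written ADMM loop are all assumptions made for illustration, and the function names are hypothetical. The toy problem solved is: minimize the sum of squared weights subject to the weights being nonnegative, summing to one, and matching the treated-group means of the Nyström features.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian kernel matrix exp(-gamma * ||x - y||^2) (assumed kernel choice)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gamma, tol=1e-8):
    """Nystrom feature map Z such that Z @ Z.T approximates the kernel matrix of X."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    evals, evecs = np.linalg.eigh(K_mm)
    keep = evals > tol * evals.max()      # drop numerically null directions
    return K_nm @ evecs[:, keep] / np.sqrt(evals[keep])

def admm_balancing_weights(Z, target, rho=1.0, n_iter=500):
    """Toy ADMM for: min 0.5*||w||^2  s.t.  Z.T @ w = target, sum(w) = 1, w >= 0.

    Splitting: x carries the equality constraints, z carries nonnegativity,
    and u is the scaled dual variable enforcing x = z.
    """
    n = Z.shape[0]
    A = np.vstack([Z.T, np.ones(n)])      # balance rows plus the sum-to-one row
    b = np.append(target, 1.0)
    A_pinv = np.linalg.pinv(A)
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(n_iter):
        v = rho * (z - u) / (1.0 + rho)   # unconstrained minimizer of the x-step
        x = v - A_pinv @ (A @ v - b)      # project v onto {x : A x = b}
        z = np.maximum(x + u, 0.0)        # project onto the nonnegative orthant
        u = u + x - z                     # dual update
    return z                              # nonnegative; balance holds approximately

# Synthetic example (made-up data): reweight controls toward the treated group.
rng = np.random.default_rng(0)
X_control = rng.normal(0.0, 1.0, size=(300, 2))
X_treated = rng.normal(0.3, 1.0, size=(100, 2))
X_all = np.vstack([X_control, X_treated])
landmarks = X_all[rng.choice(len(X_all), size=10, replace=False)]
gamma = 0.5

Z_control = nystrom_features(X_control, landmarks, gamma)
Z_treated = nystrom_features(X_treated, landmarks, gamma)
target = Z_treated.mean(axis=0)
weights = admm_balancing_weights(Z_control, target)

print("max imbalance, uniform weights:", np.abs(Z_control.mean(axis=0) - target).max())
print("max imbalance, ADMM weights:   ", np.abs(Z_control.T @ weights - target).max())
```

The sketch returns the nonnegative iterate, so the balance constraints are met only up to the ADMM residual after `n_iter` iterations. At the scale described in the abstract, one would presumably rely on a dedicated ADMM-based solver and on balance tolerances rather than exact equality, neither of which is shown here.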
