Keywords: electronic health records, 30-day readmission prediction, causal discovery, machine learning
Abstract: Hospital readmission within 30 days remains an important challenge in healthcare, with implications for care quality, discharge planning, and resource allocation. While machine learning models trained on electronic health records (EHRs) can accurately predict readmission risk, their explanations are typically associative and provide limited insight into the underlying problem structure. In this work, we combine predictive modeling and causal discovery to study 30-day readmission and mortality prediction on a preprocessed MIMIC-IV cohort. We first benchmark several tabular machine learning models and use SHAP-based attribution to identify the most influential predictors. We then apply three causal discovery methods to assess whether stable, clinically plausible dependency patterns emerge beyond purely predictive signals. Our results show that richer EHR feature sets substantially outperform a LACE-only baseline, with tree-based models achieving the strongest predictive performance. Across predictive, causal, and subgroup analyses, a consistent core set of variables centered on Physical Status, Length, LACE score, Comorbidity, and Age remains structurally important. These findings suggest that combining standard predictive models with causal discovery can provide a compact, interpretable structural view of readmission risk, offering a first step toward causally informed intervention design to reduce preventable readmissions.
Submission Number: 84