Keywords: survival analysis, longitudinal extrapolation, life expectancy, electronic healthcare records
TL;DR: Improving survival extrapolations by enforcing causal relationship priors from epidemiological studies to ensure clinically meaningful and population-consistent projections
Abstract: Accurate long-term risk estimation is critical for managing chronic diseases. Survival analysis provides a framework to quantify the causal effect of risk factors on time-to-event outcomes, but the short follow-up periods common in clinical cohorts lead to heavy censoring and limit the estimation of long-term effects. Existing extrapolation methods often focus on population-level outcomes and rely on loosely defined external data, such as expert heuristics, which limits their utility for personalised risk estimation. We propose LongSurv, a framework that extrapolates individual-level survival trajectories from short-term electronic healthcare record (EHR) data by integrating epidemiological priors such as hazard ratios and relative risks. These priors are incorporated during training via two loss functions: a life expectancy consistency loss that aligns predictions with demographic expectations, and a group-wise ranking loss that preserves clinically valid risk orderings. Evaluated on 4595 post-PCI patients from Maharashtra, India (95% censoring), LongSurv outperforms a Weibull baseline in discrimination (C-index 0.6946 vs. 0.5227) and calibration (IBS 0.0390 vs. 0.0494), while enabling counterfactual reasoning for personalised care. By connecting short-term observational data with long-term causal survival insights, LongSurv provides an interpretable and scalable approach to risk estimation.
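The abstract names the two loss terms but does not give their form. As a rough illustration only, the sketch below assumes that life expectancy is approximated by the area under a discretised survival curve, that the consistency term is a mean squared gap to a demographic reference, and that the ranking term is a hinge penalty on the difference in mean predicted risk between an exposed group (prior hazard ratio above 1) and a reference group. All function names, parameters, and functional forms here are hypothetical and are not taken from LongSurv.

```python
import numpy as np


def life_expectancy(survival, dt=1.0):
    """Area under each survival curve (trapezoidal rule).

    survival: shape (n_patients, n_steps), S(t) on a uniform grid from t = 0
    with spacing dt. Assumed discretisation, not the paper's definition.
    """
    survival = np.asarray(survival, dtype=float)
    return dt * (survival[:, :-1] + survival[:, 1:]).sum(axis=1) / 2.0


def life_expectancy_loss(survival, expected_le, dt=1.0):
    """Mean squared gap between predicted and demographic life expectancy."""
    predicted = life_expectancy(survival, dt)
    expected = np.asarray(expected_le, dtype=float)
    return float(np.mean((predicted - expected) ** 2))


def groupwise_ranking_loss(risk, in_exposed_group, margin=0.0):
    """Hinge penalty when the exposed group is not scored as higher risk.

    risk: predicted risk score per patient (higher means worse prognosis).
    in_exposed_group: boolean mask for the group whose prior hazard ratio
    relative to the reference group is greater than 1 (hypothetical encoding).
    """
    risk = np.asarray(risk, dtype=float)
    mask = np.asarray(in_exposed_group, dtype=bool)
    gap = risk[mask].mean() - risk[~mask].mean()
    return float(max(0.0, margin - gap))


if __name__ == "__main__":
    curves = np.array([[1.0, 0.5, 0.0], [1.0, 1.0, 1.0]])
    print(life_expectancy(curves))
    print(life_expectancy_loss(curves, expected_le=[1.0, 2.0]))
    print(groupwise_ranking_loss([0.2, 0.8], [True, False]))
```

In a training setup, terms like these would typically be added to the standard survival likelihood with tunable weights; the weighting and the exact priors used are not specified in the abstract.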
Submission Number: 53