Abstract: Survival analysis, or time-to-event analysis,
is an important and widespread problem in
healthcare research. Medical research has traditionally relied on Cox models for survival analysis, due to their simplicity and interpretability. Cox models assume a log-linear hazard
function as well as proportional hazards over
time, and can perform poorly when these assumptions fail. Newer survival models based on
machine learning avoid these assumptions and
offer improved accuracy, yet sometimes at the
expense of model interpretability, which is vital for clinical use. We propose a novel survival
analysis pipeline that is both interpretable and
competitive with state-of-the-art survival models. Specifically, we use an improved version of
survival stacking to transform a survival analysis problem into a classification problem, ControlBurn to perform feature selection, and Explainable Boosting Machines to generate interpretable predictions. To evaluate our pipeline,
we predict the risk of heart failure using a large-scale electronic health record (EHR) database. Our pipeline achieves
state-of-the-art performance and provides interesting and novel insights about risk factors for
heart failure.
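
To make the first pipeline step concrete, the following is a minimal sketch of standard survival stacking, not the improved version the abstract refers to, whose details are not given here. For each distinct event time, every subject still at risk contributes one row, labeled 1 if that subject has the event at that time and 0 otherwise. The function name `stack_survival`, the toy data, and the choice to append the event time as a plain numeric feature are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def stack_survival(
    X: Sequence[Sequence[float]],
    times: Sequence[float],
    events: Sequence[int],
) -> Tuple[List[List[float]], List[int]]:
    """Expand (covariates, time, event) records into a binary classification set.

    events[i] is 1 if subject i had the event at times[i], 0 if censored.
    """
    # Only times at which an event was actually observed define a risk set.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})

    rows: List[List[float]] = []
    labels: List[int] = []
    for t in event_times:
        for x, t_i, e_i in zip(X, times, events):
            if t_i >= t:  # subject i is still at risk at time t
                # Assumption: the time is appended as a numeric feature;
                # one-hot risk-set indicators are another common choice.
                rows.append(list(x) + [t])
                labels.append(1 if (e_i == 1 and t_i == t) else 0)
    return rows, labels


# Toy data (hypothetical): one covariate (age), three subjects.
X = [[63.0], [51.0], [70.0]]
times = [2.0, 5.0, 5.0]
events = [1, 0, 1]  # the second subject is censored at time 5

rows, labels = stack_survival(X, times, events)
```

The stacked rows and labels can then be passed to any binary classifier, whose predicted probability for a row approximates the hazard at that row's time. Note that the stacked set grows with the number of distinct event times, which matters for large cohorts.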