Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads

Todd Morrill; Aahlad Manas Puli; Murad Megjhani; Soojin Park; Richard Zemel

Published: 23 Sept 2025, Last Modified: 01 Dec 2025. Venue: TS4H NeurIPS 2025 Poster. License: CC BY 4.0

Keywords: survival analysis, mixture-of-experts, clustering, calibration, accuracy

TL;DR: We introduce three novel mixture-of-experts (MoE) architectures for survival analysis, one of which achieves excellent patient clustering, calibration, and predictive accuracy, and we conclude that the expressiveness of the experts is key to performance.

Abstract: Deep mixture-of-experts (MoE) models have attracted considerable attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, this grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. The cause is the restrictive inductive bias that an MoE imposes: predictions for an individual patient must look like predictions for the group to which that patient is assigned. Can we discover patient group structure, where it exists, while also improving calibration and predictive accuracy? In this work, we introduce several novel deep MoE-based architectures for survival analysis problems, one of which achieves all three desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator among these MoEs is how expressive their experts are: more expressive experts that tailor predictions to each patient outperform experts that rely on fixed group prototypes.
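
The contrast between fixed group prototypes and per-patient experts can be illustrated with a toy sketch. The abstract does not specify the architectures, so everything below is an assumption made for illustration: a discrete-time setting with a handful of time bins, a single linear gating layer, random untrained weights, and hypothetical names such as `predict_fixed` and `predict_per_patient`. It is not the authors' model, only a minimal picture of the two kinds of expert head.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Compute weights @ x + bias for a list-of-rows weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mix(gate, expert_pmfs):
    """Mix per-expert event-time distributions using the gate weights."""
    n_bins = len(expert_pmfs[0])
    return [sum(g * pmf[t] for g, pmf in zip(gate, expert_pmfs))
            for t in range(n_bins)]


def survival_curve(pmf):
    """S(t) = P(T > t) = 1 - cumulative sum of the event-time pmf."""
    curve, cum = [], 0.0
    for p in pmf:
        cum += p
        curve.append(max(0.0, 1.0 - cum))
    return curve


def rand_matrix(rng, rows, cols):
    """Random Gaussian matrix standing in for learned weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
N_FEATURES, N_EXPERTS, N_BINS = 4, 3, 5  # hypothetical sizes

# Gating network (one linear layer here): patient features -> expert weights.
gate_w = rand_matrix(rng, N_EXPERTS, N_FEATURES)
gate_b = [0.0] * N_EXPERTS

# Fixed-prototype experts: one logit vector per expert, shared by all patients.
prototype_logits = rand_matrix(rng, N_EXPERTS, N_BINS)

# Per-patient experts: each expert maps patient features to its own logits.
expert_w = [rand_matrix(rng, N_BINS, N_FEATURES) for _ in range(N_EXPERTS)]
expert_b = [[0.0] * N_BINS for _ in range(N_EXPERTS)]


def predict_fixed(x):
    """Only the gate depends on x; each expert's distribution is a constant."""
    gate = softmax(linear(gate_w, gate_b, x))
    pmfs = [softmax(logits) for logits in prototype_logits]
    return gate, mix(gate, pmfs)


def predict_per_patient(x):
    """Both the gate and every expert's distribution depend on x."""
    gate = softmax(linear(gate_w, gate_b, x))
    pmfs = [softmax(linear(w, b, x)) for w, b in zip(expert_w, expert_b)]
    return gate, mix(gate, pmfs)


patient = [0.5, -1.2, 0.3, 2.0]  # made-up feature vector
gate, pmf_fixed = predict_fixed(patient)
_, pmf_tailored = predict_per_patient(patient)
print("gate weights:", [round(g, 3) for g in gate])
print("fixed-prototype S(t):", [round(s, 3) for s in survival_curve(pmf_fixed)])
print("per-patient S(t):", [round(s, 3) for s in survival_curve(pmf_tailored)])
```

In this sketch the fixed-prototype prediction is always a convex combination of the same few prototype distributions, so a patient can differ from the assigned group only through the gate weights. The per-patient variant keeps the gate, and therefore a cluster assignment, but lets each expert reshape its distribution for the individual, which is one plausible reading of "more expressive experts".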

Submission Number: 71
