Keywords: pediatric sleep, polysomnography, PHATE, physiological time-series, topological data analysis, trajectory analysis, EHR, multimodal representation learning
TL;DR: By analyzing the trajectory and topology of multimodal sleep embeddings, we reveal interpretable markers of pediatric sleep disorders.
Track: Proceedings
Abstract: While generative models have shown promise in pediatric sleep analysis, the latent structure of their multimodal embeddings remains poorly understood. This work investigates the $\textit{session-wide}$ diagnostic information contained in $\textit{sequences}$ of 30-second pediatric polysomnography (PSG) epochs embedded by a multimodal masked autoencoder. We test whether augmenting the embeddings with (i) PHATE-derived per-epoch coordinates and whole-night movement descriptors, (ii) persistent homology summaries of the embedding cloud, and (iii) electronic health record (EHR) data yields task-relevant signals. Simple linear and MLP models, chosen for interpretability rather than state-of-the-art performance, show that geometric, topological, and clinical features each provide complementary gains. For binary predictions, feature importance is task-dependent, and more expressive late-fusion models generally perform better, with AUPRC rising from 0.26 to 0.34 for desaturation, from 0.31 to 0.48 for EEG arousal, from 0.09 to 0.22 for hypopnea, and from 0.05 to 0.14 for apnea. We also report the Brier score and Expected Calibration Error, where the full fusion model yields the best calibration across all four binary tasks. Our study shows that latent geometry/topology and EHR data offer complementary, interpretable signals beyond the embeddings alone, improving calibration and robustness under extreme class imbalance.
General Area: Models and Methods
Specific Subject Areas: Representation Learning, Explainability & Interpretability, Time Series
Supplementary Material: zip
Data And Code Availability: Yes
Ethics Board Approval: No
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Submission Number: 268
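
Note on the calibration metrics named in the abstract: the sketch below illustrates how a Brier score and an Expected Calibration Error (ECE) can be computed for a binary task. It is a minimal, dependency-free illustration and not the authors' evaluation code. The function names, the equal-width binning, and the default of 10 bins are assumptions, since the abstract does not specify the binning scheme.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    n = len(y_true)
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """ECE with equal-width bins over the predicted positive-class probability.

    Assumption: equal-width binning on [0, 1]; other schemes (e.g. equal-mass
    bins) give different values.
    """
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        frac_pos = sum(y for y, _ in members) / len(members)   # observed rate
        mean_conf = sum(p for _, p in members) / len(members)  # mean prediction
        ece += (len(members) / n) * abs(frac_pos - mean_conf)
    return ece


# Toy labels and probabilities (made up for illustration, not study data).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.6, 0.9]
toy_brier = brier_score(labels, probs)
toy_ece = expected_calibration_error(labels, probs)
```

Lower is better for both quantities. Under heavy class imbalance, such as the apnea and hypopnea tasks, they are typically read alongside AUPRC rather than in place of it.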