Keywords: Medical Time Series, Deep Learning, EEG, Generalization, Disease Detection
TL;DR: Spurious correlation between the label and subject-specific features is the key factor behind the performance drop on unseen subjects in disease detection from medical time series.
Abstract: Models for disease detection in medical time series (MedTS) often excel on training subjects but fail to generalize to unseen subjects. In many disease detection datasets, each subject is associated with a single, fixed label, resulting in a strong yet spurious correlation between subject identity and disease label. Across EEG- and ECG-based disease detection, this spurious identity correlation inflates performance in subject-dependent evaluations (train and test sets share subjects), but performance collapses under subject-independent splits, where test subjects are unseen. Our comparative experiments indicate that disease detection models often exploit the shortcut of patient identity, severely limiting their generalization to unseen subjects. These findings highlight the critical need for methods designed to mitigate subject identity as a spurious feature and reinforce the importance of a subject-independent setup for clinically meaningful MedTS disease detection.
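The difference between the two evaluation setups named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy dataset, the field names (`subject`, `segment`, `label`), and all function names are hypothetical, and the only property taken from the abstract is that each subject carries a single fixed label.

```python
import random


def make_samples(n_subjects=10, segments_per_subject=20):
    """Build a hypothetical dataset of segment records (no real signals)."""
    samples = []
    for subject_id in range(n_subjects):
        # Assumption: one fixed label per subject, shared by all its segments.
        label = subject_id % 2
        for segment_id in range(segments_per_subject):
            samples.append(
                {"subject": subject_id, "segment": segment_id, "label": label}
            )
    return samples


def subject_dependent_split(samples, test_fraction=0.2, seed=0):
    """Split at the segment level: the same subject can land in both sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_independent_split(samples, test_fraction=0.2, seed=0):
    """Split at the subject level: test subjects never appear in training."""
    rng = random.Random(seed)
    subjects = sorted({s["subject"] for s in samples})
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [s for s in samples if s["subject"] not in test_subjects]
    test = [s for s in samples if s["subject"] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Return the subject IDs that occur in both train and test."""
    return {s["subject"] for s in train} & {s["subject"] for s in test}
```

Calling `shared_subjects` on the output of each split makes the leakage visible. The segment-level split leaves subjects on both sides, so a model could score well simply by recognizing identity. The subject-level split leaves the overlap empty.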
Primary Area: learning on time series and dynamical systems
Submission Number: 1143