Rethinking channel fusion for robust multivariate time series classification under distribution shift
Abstract: In real-world applications of multivariate time series classification (TSC), distribution shift between training and testing data is common, often leading to degraded out-of-distribution (OOD) performance relative to in-distribution (ID) performance. Existing methods typically improve robustness through the training objective or augmentation, and evaluate models that use early fusion, where channels are jointly processed. However, the impact of fusing channels at later stages remains unclear. We show that later fusion structurally isolates channel-specific shifts, preventing a corrupted channel from contaminating the full feature representation, as happens in early fusion. We evaluate this across four human activity recognition (HAR) datasets and four motor imagery (MI) datasets under both subject-level and sensor corruption distribution shifts. Across HAR datasets, later fusion consistently reduces the ID-OOD gap, and models trained with standard empirical risk minimisation (ERM) outperform domain generalisation algorithms, often substantially. Later fusion also exhibits strong resilience to sensor corruptions, with late fusion showing near-zero degradation even when half of all channels are corrupted. However, these gains are dataset-dependent: on MI datasets, the ID cost of later fusion outweighs its robustness benefits, while domain generalisation algorithms offer little improvement. We additionally propose a simple ID-based heuristic for selecting fusion strategies. Our findings show that fusion strategy is a critical and underexplored design choice for OOD robustness in multivariate TSC, with effects that can rival those of specialised learning algorithms. The code for this work is available at \url{https://...}.
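The structural isolation claimed in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the encoders below are fixed random weights with a tanh nonlinearity, and all names (`W_early`, `W_late`, `early_fusion_features`, `late_fusion_features`, `changed_fraction`) and sizes are hypothetical. It only shows that when each channel is encoded independently, corrupting one channel changes only that channel's block of features, whereas mixing channels up front lets the corruption reach every feature.

```python
import numpy as np

rng = np.random.default_rng(0)
C, T, F = 4, 128, 8  # channels, time steps, features per channel (hypothetical sizes)

# Hypothetical fixed random weights standing in for trained encoders.
W_early = rng.standard_normal((C * F, C))  # mixes all channels at every time step
W_late = rng.standard_normal((C, F))       # one independent weight vector per channel

def early_fusion_features(x):
    """x has shape (C, T). Channels are mixed first, then pooled over time."""
    h = np.tanh(W_early @ x)   # (C*F, T): every row depends on all channels
    return h.mean(axis=1)      # (C*F,)

def late_fusion_features(x):
    """x has shape (C, T). Each channel is encoded alone, then features are concatenated."""
    h = np.tanh(W_late[:, :, None] * x[:, None, :])  # (C, F, T)
    return h.mean(axis=2).reshape(-1)                # (C*F,): block c depends only on channel c

def changed_fraction(f_clean, f_corrupt, tol=1e-12):
    """Fraction of feature entries that differ between clean and corrupted inputs."""
    return float(np.mean(np.abs(f_clean - f_corrupt) > tol))

x = rng.standard_normal((C, T))
x_corrupt = x.copy()
x_corrupt[0] += 5.0 * rng.standard_normal(T)  # corrupt channel 0 only

early_changed = changed_fraction(early_fusion_features(x), early_fusion_features(x_corrupt))
late_changed = changed_fraction(late_fusion_features(x), late_fusion_features(x_corrupt))

print(f"early fusion: {early_changed:.2f} of features changed")
print(f"late fusion:  {late_changed:.2f} of features changed")
```

In this sketch at most one block of `F` features out of `C * F` can change under late fusion, so a downstream classifier still sees the remaining blocks untouched. Whether that isolation translates into better OOD accuracy depends on the dataset, as the abstract notes.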
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Han-Jia_Ye1
Submission Number: 8893