Spatial Correlation Structure Determines the Effectiveness of Channel Mixing Strategies in Time Series Forecasting
Abstract: Channel-dependent (CD) and channel-independent (CI) strategies represent competing inductive biases in long-term time series forecasting. While empirical studies suggest that CD strategies become more effective as channel correlation increases, the specific data characteristics that determine when each strategy is effective have not been systematically quantified. We introduce two dataset properties to characterize the effectiveness of CI, CD, and hybrid models: the high-correlation fraction, defined as the proportion of highly correlated channel pairs, and block separation, defined as the degree of separation between channel clusters. Using the hybrid Series-cOre Fused Time Series (SOFTS) model as a controlled testbed, we develop a fully CD variant, Channel Mixer SOFTS (C-SOFTS), that maximizes channel interactions in both the spatial and frequency domains, and a fully CI variant, Identity SOFTS (I-SOFTS), that removes all channel interactions. We find that I-SOFTS consistently outperforms the hybrid on few-channel, low-correlation datasets. C-SOFTS outperforms the hybrid on datasets with high block separation, or with a large high-correlation fraction and a few clusters, achieving up to 15.9% average MSE improvement. The hybrid proves optimal only when the high-correlation fraction and block separation are both moderately low. These results show that the CI-CD choice is not a universal architectural decision but a dataset-dependent one. We advocate for reporting spatial dataset characteristics alongside performance metrics as a standard practice, enabling practitioners to match inductive biases to data regimes rather than relying on universal architectural rankings.
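As a rough illustration of the first property, the sketch below computes a high-correlation fraction from a multivariate series. The abstract gives only the verbal definition, so the specifics here are assumptions rather than the authors' method: Pearson correlation, absolute values, a cutoff of 0.8, and the function name `high_correlation_fraction` are all hypothetical choices.

```python
import numpy as np


def high_correlation_fraction(series, threshold=0.8):
    """Fraction of channel pairs whose |Pearson correlation| >= threshold.

    series: array of shape (time_steps, channels).
    threshold: assumed cutoff for "highly correlated"; the abstract does
    not state the value used in the paper.
    """
    series = np.asarray(series, dtype=float)
    # Correlation matrix between channels (columns are variables).
    corr = np.corrcoef(series, rowvar=False)
    # Each unordered pair appears once in the strict upper triangle.
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    return float(np.mean(np.abs(corr[rows, cols]) >= threshold))


# Toy data: channel 1 is a linear function of channel 0, while channel 2
# is constructed to be uncorrelated with both.
t = np.arange(8, dtype=float)
demo = np.column_stack([
    t,
    2.0 * t + 1.0,
    np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]),
])
frac = high_correlation_fraction(demo)
# Only the (0, 1) pair clears the cutoff, so one of three pairs counts.
print(frac)
```

Block separation would need a clustering of channels first, and the abstract does not say how clusters are formed, so it is left out of this sketch.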
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Brian_Quanz2
Submission Number: 8261