ParallelTime: Dynamically Weighting the Balance of Short- and Long-Term Temporal Dependencies

Itay Katav; Aryeh Kontorovich

19 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: time series, forecasting, mamba, transformer, attention, dynamic weighting of temporal dependencies

Abstract: Modern multivariate time series forecasting relies primarily on two architectures: the Transformer, with its attention mechanism, and Mamba. In natural language processing, one approach combines local window attention for capturing short-term dependencies with Mamba for capturing long-term dependencies, averaging their outputs so that both receive equal weight. We find that for time series forecasting tasks, assigning equal weight to long-term and short-term dependencies is not optimal. To address this, we propose a dynamic weighting mechanism, ParallelTime Weighter, which calculates interdependent weights for long-term and short-term dependencies for each token based on the input and the model's knowledge. Furthermore, we introduce the ParallelTime architecture, which incorporates the ParallelTime Weighter mechanism to deliver state-of-the-art performance across diverse benchmarks. Our architecture is robust, requires fewer FLOPs and fewer parameters, scales effectively to longer prediction horizons, and significantly outperforms existing methods. These advances highlight a promising path for future development of parallel Attention-Mamba architectures in time series forecasting. The implementation is available at the GitHub link.
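To make the contrast between equal-weight averaging and per-token dynamic weighting concrete, the following is a minimal sketch and not the paper's ParallelTime Weighter. It assumes that "interdependent" weights can be modeled as a two-way softmax (so the two weights sum to 1) and that the gate is a single linear projection of the concatenated branch outputs. The names `dynamic_mix`, `gate_weights`, and `gate_bias` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_mix(
    short_out: Sequence[float],
    long_out: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> Tuple[List[float], Tuple[float, float]]:
    """Blend two branch outputs for ONE token with input-dependent weights.

    short_out / long_out: stand-ins for the local-attention and Mamba branch
        outputs for a single token (both of length d).
    gate_weights: hypothetical learned projection, 2 rows of length 2 * d.
    gate_bias: hypothetical learned bias, length 2.

    Assumption (not from the paper): the gate is one linear layer followed by
    a softmax, so the short- and long-term weights always sum to 1.
    """
    features = list(short_out) + list(long_out)
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(gate_weights, gate_bias)
    ]
    w_short, w_long = softmax(logits)
    mixed = [w_short * s + w_long * l for s, l in zip(short_out, long_out)]
    return mixed, (w_short, w_long)


# A gate with all-zero parameters reduces to plain averaging of the branches.
zero_gate = [[0.0] * 4, [0.0] * 4]
averaged, equal_weights = dynamic_mix([1.0, 3.0], [3.0, 5.0], zero_gate, [0.0, 0.0])

# A non-zero bias toward the first logit shifts weight to the short-term branch.
biased, biased_weights = dynamic_mix([1.0, 3.0], [3.0, 5.0], zero_gate, [2.0, 0.0])
```

With the all-zero gate the function reproduces the equal-weight baseline that the abstract describes as suboptimal. Any non-trivial gate lets the balance differ from token to token, because the logits depend on that token's branch outputs.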

Supplementary Material: zip

Primary Area: learning on time series and dynamical systems

Submission Number: 20257
