Gateformer: Advancing Multivariate Time Series Forecasting via Temporal and Variate-Wise Attention with Gated Representations

Published: 09 Jun 2025, Last Modified: 09 Jun 2025 · FMSD @ ICML 2025 · CC BY 4.0
Keywords: Time series forecasting
TL;DR: We propose Gateformer, which combines cross-time and cross-variate attention with a gating fusion module for enhanced time series forecasting.
Abstract: Transformer-based models have recently shown promise in time series forecasting, yet effectively modeling multivariate time series remains challenging due to the need to capture both temporal (cross-time) and variate (cross-variate) dependencies. While prior methods attempt to address both, it remains unclear how to optimally integrate these dependencies within the Transformer architecture for both accuracy and efficiency. We re-purpose the Transformer to explicitly model these two types of dependencies: first embedding each variate independently to capture temporal dynamics, then applying attention over these embeddings to model cross-variate relationships. Gating mechanisms in both stages regulate information flow, enabling the model to focus on relevant features. Our approach achieves state-of-the-art performance on 13 real-world datasets and can be integrated into Transformer-based, LLM-based, and foundation time series models, improving performance by up to 20.7%.
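
To make the two-stage design in the abstract concrete, here is a minimal PyTorch sketch: each variate's series is embedded independently (cross-time), attention is then applied across the variate embeddings (cross-variate), and sigmoid gates regulate information flow at both stages. All names (GateformerSketch, d_model, the specific gating form) are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class GateformerSketch(nn.Module):
    """Illustrative sketch of the two-stage, gated design described in the abstract."""

    def __init__(self, seq_len: int, pred_len: int, d_model: int = 128, n_heads: int = 8):
        super().__init__()
        # Stage 1: embed each variate's full series independently (cross-time).
        self.temporal_embed = nn.Linear(seq_len, d_model)
        self.temporal_gate = nn.Linear(seq_len, d_model)
        # Stage 2: attention across variate embeddings (cross-variate).
        self.variate_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.variate_gate = nn.Linear(d_model, d_model)
        # Project each variate embedding to the forecast horizon.
        self.head = nn.Linear(d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_variates) -> treat each variate as a token.
        x = x.transpose(1, 2)                                     # (batch, n_variates, seq_len)
        # Gated temporal embedding: the sigmoid gate regulates information flow.
        z = self.temporal_embed(x) * torch.sigmoid(self.temporal_gate(x))
        # Cross-variate attention over the n_variates tokens.
        attn_out, _ = self.variate_attn(z, z, z)
        # Gated fusion of the attention output with the temporal embeddings.
        z = z + attn_out * torch.sigmoid(self.variate_gate(attn_out))
        y = self.head(z)                                          # (batch, n_variates, pred_len)
        return y.transpose(1, 2)                                  # (batch, pred_len, n_variates)

# Usage: forecast 24 steps from 96 observed steps of 7 variates.
model = GateformerSketch(seq_len=96, pred_len=24)
out = model(torch.randn(32, 96, 7))  # -> (32, 24, 7)
```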
Submission Number: 9