CGTFra: General Graph Transformer Framework for Consistent Inter-series Dependency Modeling in Multivariate Time Series

ICLR 2026 Conference Submission 7185 Authors

16 Sept 2025 (modified: 21 Nov 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Graph transformer, Multivariate time series forecasting, Intra-series dependency modeling, Inter-series dependency modeling
TL;DR: We propose CGTFra, a Graph Transformer framework designed to ensure consistent inter-variate dependency (IVD) modeling.
Abstract: Transformers have emerged as dominant predictors in multivariate time series forecasting (MTSF), prompting an in-depth investigation into their limitations within this application. First, the conventional use of timestamp information in MTSF is hampered by the unavailability of future timestamps and the diversity of timestamp formats across real-world datasets, which poses a significant practical challenge and necessitates cumbersome adjustments for a unified forecasting model. Second, existing variate Transformers, such as iTransformer, typically model inter-variate dependencies (IVD) only within shallow self-attention layers, neglecting the need for deep-layer IVD modeling; this causes dependency information loss and complicates model optimization. We refer to this phenomenon as inconsistent IVD modeling. To address these limitations, we design CGTFra, a Graph Transformer framework that promotes consistent IVD modeling. Specifically, we introduce a frequency-domain masking and resampling method for feature enhancement that preserves periodic characteristics in the frequency domain. In addition, through a comprehensive analysis of the distinctions and connections between the self-attention mechanisms of variate Transformers and graph neural networks (GNNs) in capturing IVD, we integrate a dynamic graph learning framework into the Transformer to explicitly model IVD in deep network layers. Crucially, we then propose a consistency-constrained alignment that encourages the network to learn more robust IVD and temporal feature representations. The core design philosophy of CGTFra can be integrated into any existing variate Transformer-based framework, and CGTFra demonstrates superior predictive performance across 13 long- and short-term forecasting datasets with high computational efficiency. Code is available at https://anonymous.4open.science/r/CGTFra.
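To make the frequency-domain masking and resampling idea concrete, below is a minimal PyTorch sketch of one plausible reading of that augmentation: move each series to the frequency domain, always keep the strongest (periodic) components, randomly mask a subset of the weaker ones, and resample back to the time domain. The function name, the keep_ratio parameter, and the random-drop rate are assumptions for illustration, not the authors' actual implementation; consult the linked repository for the real method.

```python
# Hypothetical sketch of frequency-domain masking and resampling for
# feature enhancement. Names and hyperparameters are assumptions, not
# taken from the CGTFra codebase.
import torch

def freq_mask_resample(x: torch.Tensor, keep_ratio: float = 0.8) -> torch.Tensor:
    """Augment a batch of series of shape (B, L, N), preserving dominant periods.

    The rFFT maps each length-L series to L//2 + 1 frequency bins; the
    highest-magnitude bins (the periodic structure) are always retained,
    weaker bins are randomly masked, and an inverse rFFT resamples the
    signal back to the time domain.
    """
    spec = torch.fft.rfft(x, dim=1)                    # (B, L//2 + 1, N), complex
    mag = spec.abs()
    k = max(1, int(keep_ratio * spec.size(1)))
    # Indices of the k strongest frequency bins per series: always preserved.
    topk = mag.topk(k, dim=1).indices
    keep = torch.zeros_like(mag, dtype=torch.bool).scatter_(1, topk, True)
    # Randomly retain roughly half of the remaining, weaker bins (assumed rate).
    rand_keep = torch.rand_like(mag) < 0.5
    spec = spec * (keep | rand_keep).to(spec.dtype)
    return torch.fft.irfft(spec, n=x.size(1), dim=1)   # back to (B, L, N)
```

Because the top-k magnitude bins are never masked, the augmented series keeps the periodic characteristics the abstract emphasizes, while the random masking of weaker components perturbs the signal for feature enhancement.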
Supplementary Material: zip
Primary Area: learning on time series and dynamical systems
Submission Number: 7185