MGTST: Multi-scale and Cross-channel Gated Transformer for Multivariate Long-term Time-series Forecasting
Transformer-based models have emerged as popular choices for multivariate long-term time-series forecasting due to their ability to capture long-term dependencies. However, current transformer-based models either overlook crucial mutual dependencies among channels or fail to capture temporal patterns at different scales. To fill this gap, we propose a novel model called MGTST (Multi-scale and Cross-channel Gated Time-Series Transformer). The model introduces three novel designs: a Parallel Multi-Scale Architecture (PMSA), Temporal Embedding with Representation Tokens (TERT), and a Cross-Channel Attention and Gated Mechanism (CCAGM). In addition, we introduce Channel Grouping (CG) to mitigate channel-interaction redundancy on datasets with many channels. Experimental results demonstrate that our model outperforms both channel-dependent (CD) and channel-independent (CI) baselines on seven widely used benchmark datasets, improving forecasting accuracy over the state of the art by 1.5% to 41.9%.
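The abstract names a Cross-Channel Attention and Gated Mechanism (CCAGM) but does not specify its internals. The sketch below is an illustrative, hypothetical rendering of the general idea, not the paper's implementation: each channel is represented by one token, standard scaled dot-product attention mixes information across channels, and a learned sigmoid gate controls how much of the mixed signal enters a residual update. All function and weight names are assumptions introduced for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_channel_gated_attention(X, Wq, Wk, Wv, Wg):
    """Hypothetical sketch of a cross-channel gated attention block.

    X: (channels, d_model) -- one embedded token per channel.
    Attention is computed over the channel axis, so each channel
    attends to every other channel; a sigmoid gate then modulates
    the mixed signal before the residual connection.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
    mixed = scores @ V                        # (channels, d_model)
    gate = 1.0 / (1.0 + np.exp(-(X @ Wg)))    # per-feature sigmoid gate
    return X + gate * mixed                   # gated residual update

rng = np.random.default_rng(0)
C, d = 7, 16                                  # e.g. 7 channels, model width 16
X = rng.standard_normal((C, d))
Wq, Wk, Wv, Wg = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
out = cross_channel_gated_attention(X, Wq, Wk, Wv, Wg)
print(out.shape)  # (7, 16): channel-mixed representations, same shape as input
```

Because the gate can close (output near zero), such a block can in principle fall back toward channel-independent behavior when cross-channel interactions are unhelpful, which is one plausible motivation for combining gating with cross-channel attention.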