Keywords: long-term time series, information bottleneck
TL;DR: We enhance long-term time series forecasting by proposing MEW, a new metric, and two modules (IBF and HTM) that help Transformers better use historical data.
Abstract: Long-term time series forecasting (LTSF) aims to predict future trends based on historical data. While longer lookback windows theoretically offer more comprehensive insights, Transformer-based models often struggle with them. On one hand, longer windows introduce more noise and redundancy, hindering the model's learning process. On the other hand, Transformers suffer from attention dispersion and are prone to overfitting to noise, especially when processing long sequences. In this paper, we introduce the Maximum Effective Window (MEW) metric to assess a model's ability to effectively utilize the lookback window. We also propose two model-agnostic modules to enhance MEW, enabling models to better leverage historical data for improved performance. Specifically, to reduce redundancy and noise, we introduce the Information Bottleneck Filter (IBF), which employs information bottleneck theory to extract the most essential subsequences from the input. Additionally, we propose the Hybrid-Transformer-Mamba (HTM), which incorporates the Mamba mechanism for selective forgetting of long sequences while harnessing the Transformer's strong modeling capabilities for shorter sequences. We integrate these two modules into various Transformer-based models, and experimental results show that they effectively enhance MEW, leading to improved overall performance. Our code is available at \url{https://github.com/forever-ly/PIH}.
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 9286