Overcoming Lookback Window Limitations: Exploring Longer Windows in Long-Term Time Series Forecasting

26 Sept 2024 (modified: 25 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: long-term time series forecasting, Mamba, Information Bottleneck
TL;DR: We adopt a Transformer-Mamba hybrid architecture and an information bottleneck to handle extremely long lookback windows in LTSF.
Abstract: Long-term time series forecasting (LTSF) aims to predict future trends based on historical data. While longer lookback windows theoretically provide more comprehensive insights, current Transformer-based models face the Lookback Window Limitation (LWL). On one hand, longer windows introduce redundant information, which can hinder model learning. On the other hand, Transformers tend to overfit temporal noise rather than extract meaningful temporal information when dealing with longer sequences, compounded by their quadratic complexity. In this paper, we aim to overcome the LWL, enabling models to leverage more historical information for improved performance. Specifically, to mitigate information redundancy, we introduce the Information Bottleneck Filter (IBF), which applies information bottleneck theory to extract essential subsequences from the input. Additionally, to address the limitations of the Transformer architecture in handling long sequences, we propose the Hybrid-Transformer-Mamba (HTM), which combines the linear complexity and long-range modeling capabilities of Mamba with the Transformer's strength in modeling short sequences. We integrate these two model-agnostic modules into various existing methods and conduct experiments on seven datasets. The results demonstrate that incorporating these modules effectively overcomes the Lookback Window Limitation. Notably, by combining them with the Patch strategy, we design PIH (\textbf{P}atch-\textbf{I}BF-\textbf{H}TM), successfully extending the window length to 1024, a significantly larger window than previously achieved, and achieving state-of-the-art results, highlighting the potential of exploring even longer windows.
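To make the abstract's pipeline concrete, the sketch below shows one plausible reading of the Patch-IBF-HTM flow in PyTorch: patch embedding of a long lookback window, a top-k Information Bottleneck Filter over patches, a Mamba-style long-range mixer, and a Transformer layer over the retained short sequence. Every module name, scoring function, and hyperparameter here is a hypothetical stand-in (in particular, the Mamba block is replaced by a gated depthwise convolution so the snippet runs without extra dependencies); this is a minimal illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MambaStandIn(nn.Module):
    """Stand-in for a Mamba (selective SSM) block; a gated depthwise
    convolution is used only so this sketch runs without mamba_ssm."""
    def __init__(self, d_model):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=4,
                              padding=3, groups=d_model)
        self.gate = nn.Linear(d_model, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                       # x: (B, L, d_model)
        h = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.proj(torch.sigmoid(self.gate(x)) * h)

class IBFilter(nn.Module):
    """Hypothetical Information Bottleneck Filter: scores patches and
    keeps only the top-k, compressing the long lookback window."""
    def __init__(self, d_model, keep_ratio=0.5):
        super().__init__()
        self.score = nn.Linear(d_model, 1)
        self.keep_ratio = keep_ratio

    def forward(self, x):                       # x: (B, N_patches, d_model)
        k = max(1, int(x.size(1) * self.keep_ratio))
        idx = self.score(x).squeeze(-1).topk(k, dim=1).indices
        idx = idx.sort(dim=1).values            # keep temporal order
        return x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))

class PIH(nn.Module):
    """Sketch of Patch-IBF-HTM: patching -> Mamba-style long-range mixing
    -> IBF pruning -> Transformer attention over the short retained sequence."""
    def __init__(self, patch_len=16, d_model=64, horizon=96):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)
        self.mamba = MambaStandIn(d_model)
        self.ibf = IBFilter(d_model)
        self.transformer = nn.TransformerEncoderLayer(
            d_model, nhead=4, batch_first=True)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):                       # x: (B, lookback), univariate
        patches = x.unfold(1, self.patch_len, self.patch_len)  # (B, N, P)
        z = self.mamba(self.embed(patches))     # linear-time long-range mixing
        z = self.ibf(z)                         # drop redundant patches
        z = self.transformer(z)                 # short-sequence attention
        return self.head(z.mean(dim=1))         # (B, horizon)

model = PIH()
print(model(torch.randn(8, 1024)).shape)        # lookback window of 1024 steps
```

The ordering (long-range mixing before filtering, attention after) is one design choice among several; the point is only that the two modules are model-agnostic wrappers that let the attention layer operate on a much shorter, information-dense sequence.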
Primary Area: learning on time series and dynamical systems
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6771