Keywords: Time series forecasting, Transformer, Deep learning
TL;DR: We propose DeformableTST, a Transformer-based model that is less reliant on patching, broadening the applicability of Transformer-based models and achieving state-of-the-art performance on a wider range of time series forecasting tasks.
Abstract: With the proposal of the patching technique in time series forecasting, Transformer-based models have achieved compelling performance and gained great interest from the time series community. At the same time, however, we observe a new problem: recent Transformer-based models rely overly on patching to achieve ideal performance, which limits their applicability to forecasting tasks unsuitable for patching. In this paper, we intend to address this emerging issue. By examining the relationship between patching and full attention (the core mechanism in Transformer-based models), we find that the underlying reason is that full attention relies heavily on the guidance of patching to focus on the important time points and to learn non-trivial temporal representations. Based on this finding, we propose DeformableTST as an effective solution. Specifically, we propose deformable attention, a sparse attention mechanism that can focus on the important time points by itself, removing the need for patching. We also adopt a hierarchical structure to alleviate the efficiency issue caused by the removal of patching. Experimentally, DeformableTST achieves consistent state-of-the-art performance across a broader range of time series forecasting tasks, and in particular performs well on forecasting tasks unsuitable for patching, thereby reducing the reliance on patching and broadening the applicability of Transformer-based models. Code is available at this repository:
https://github.com/luodhhh/DeformableTST.
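As a rough illustration of the idea described above, the sketch below shows one way a 1-D deformable attention layer over raw time points could look: each query predicts a small set of sampling offsets along the time axis, gathers values at those fractional positions via linear interpolation, and attends only over that sparse sampled set. All module names, shapes, and hyperparameters here (e.g. `Deformable1DAttention`, `n_points`) are illustrative assumptions for exposition, not the authors' actual implementation; see the repository linked above for the real code.

```python
# Minimal sketch of a 1-D deformable attention layer for time series (assumed
# design, not the paper's implementation). Queries learn where to sample along
# the time axis instead of attending to every point, so no patching is needed.
import torch
import torch.nn as nn


class Deformable1DAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4, n_points: int = 8):
        super().__init__()
        self.n_heads = n_heads
        self.n_points = n_points
        self.q_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Each query predicts K sampling offsets and K attention weights per head.
        self.offset_proj = nn.Linear(d_model, n_heads * n_points)
        self.weight_proj = nn.Linear(d_model, n_heads * n_points)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -- raw time points, no patching required.
        B, L, D = x.shape
        H, K = self.n_heads, self.n_points
        q = self.q_proj(x)
        v = self.v_proj(x).view(B, L, H, D // H)

        # Reference position of every query, normalized to [0, 1].
        ref = torch.linspace(0, 1, L, device=x.device).view(1, L, 1, 1)
        # Predicted offsets shift each query's sampling locations along time.
        offsets = self.offset_proj(q).view(B, L, H, K).tanh() / L
        loc = (ref + offsets).clamp(0, 1) * (L - 1)   # (B, L, H, K) fractional indices

        # Gather values at fractional time indices with linear interpolation.
        lo = loc.floor().long().clamp(0, L - 1)
        hi = (lo + 1).clamp(0, L - 1)
        frac = (loc - lo.float()).unsqueeze(-1)       # (B, L, H, K, 1)
        v_flat = v.permute(0, 2, 1, 3)                # (B, H, L, D/H)

        def gather(idx):
            # idx: (B, L, H, K) -> sampled values (B, L, H, K, D/H)
            idx = idx.permute(0, 2, 1, 3).reshape(B, H, L * K, 1)
            out = torch.gather(v_flat, 2, idx.expand(-1, -1, -1, D // H))
            return out.view(B, H, L, K, D // H).permute(0, 2, 1, 3, 4)

        sampled = (1 - frac) * gather(lo) + frac * gather(hi)

        # Attention weights over only the K sampled points (sparse attention).
        attn = self.weight_proj(q).view(B, L, H, K).softmax(dim=-1).unsqueeze(-1)
        out = (attn * sampled).sum(dim=3).reshape(B, L, D)
        return self.out_proj(out)


if __name__ == "__main__":
    layer = Deformable1DAttention(d_model=64)
    series = torch.randn(2, 96, 64)       # batch of 2 series, 96 raw time points
    print(layer(series).shape)            # torch.Size([2, 96, 64])
```

Because each query attends to only `n_points` learned locations rather than all `seq_len` time points, the per-layer attention cost in this sketch grows linearly with sequence length, which is the kind of efficiency property a hierarchical, patch-free backbone would rely on.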
Primary Area: Other (please use sparingly, only use the keyword field for more details)
Submission Number: 5480