Local Attention: Enhancing the Transformer Architecture for Efficient Time Series Forecasting

Ignacio Aguilera-Martos, Andrés Herrera-Poyatos, Julián Luengo, Francisco Herrera

Published: 09 Sept 2024, Last Modified: 05 May 2026
Venue: 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan
Readers: Everyone
License: CC BY 4.0

Abstract: Transformers have emerged as a highly effective architecture for natural language processing and computer vision. Recently, there has been a surge of efforts to refine this architecture for long sequence time series forecasting, with promising results. This paper introduces Local Attention, an efficient attention mechanism tailored for time series data. This mechanism exploits the continuity properties of time series and the principle of locality in order to compute fewer attention scores. We provide a Θ(n log n) algorithm to implement Local Attention based on tensor algebra results, which contrasts with the Θ(n²) time and memory complexity of the original attention mechanism. Our experimental analysis shows that the vanilla transformer with Local Attention outperforms state-of-the-art models based on probabilistic attention mechanisms. These findings affirm the effectiveness of our approach and outline a spectrum of future challenges in long sequence time series forecasting.
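To illustrate the locality idea, the following is a minimal sketch of windowed softmax attention in plain Python. It is not the tensor algebra algorithm from the paper. The function name `local_attention`, the `window` parameter, and the choice of a half-width of about log2(n) are assumptions made for this illustration. Under that assumption each query scores O(log n) neighbouring keys, so the total number of scores grows as n log n rather than n².

```python
import math


def local_attention(Q, K, V, window):
    """Windowed softmax attention (illustrative sketch, not the paper's algorithm).

    Q, K, V: lists of n vectors (lists of floats); Q and K share dimension d.
    window: hypothetical half-width; query i only scores keys j with |i - j| <= window.
    Returns (outputs, number_of_scores_computed).
    """
    n, d = len(Q), len(Q[0])
    scale = 1.0 / math.sqrt(d)
    outputs, n_scores = [], 0
    for i in range(n):
        # Restrict attention to the neighbourhood of position i.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [
            scale * sum(q * k for q, k in zip(Q[i], K[j]))
            for j in range(lo, hi)
        ]
        n_scores += len(scores)
        # Numerically stable softmax over the local scores only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of the values inside the window.
        outputs.append([
            sum(w * V[j][c] for w, j in zip(weights, range(lo, hi))) / total
            for c in range(len(V[0]))
        ])
    return outputs, n_scores


# Toy series: n time steps, each embedded as a 2-dimensional vector.
n = 64
window = math.ceil(math.log2(n))  # assumed window size, roughly log2(n)
series = [[math.sin(0.1 * t), math.cos(0.1 * t)] for t in range(n)]

out, n_scores = local_attention(series, series, series, window)
print(n_scores, "local scores versus", n * n, "for full attention")
```

Setting `window` to at least n - 1 recovers full attention, so the sketch also shows that the saving comes entirely from skipping distant query-key pairs.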