Abstract: Accurate and robust time series forecasting is critical in numerous domains but remains challenging due to temporal misalignment, noise, and complex multivariate dependencies. Transformer-based models have demonstrated strong performance on sequential tasks; however, their reliance on dot-product attention makes them sensitive to noise and less effective on misaligned time series. To address these limitations, we propose a novel dynamic time warping (DTW)-based attention mechanism that leverages a Sakoe-Chiba-constrained soft-DTW framework in place of the traditional dot-product similarity. This approach enables dynamic sequence alignment, improving robustness to temporal misalignment. Building on this mechanism, we introduce DTWformer, a multi-scale Transformer that integrates DTW-attention with adaptive patching to capture dependencies across varying temporal resolutions. DTWformer achieves superior forecasting performance and efficiency, addressing the limitations of existing approaches on misaligned and noisy time series.
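The abstract does not spell out the exact formulation, so the following is only a minimal NumPy sketch of the general idea it describes: using a negated, Sakoe-Chiba-constrained soft-DTW distance as the attention similarity instead of a dot product. All function names, shapes, and hyperparameters (`gamma`, `band`, `tau`) are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch (assumed, not the paper's code): soft-DTW with a Sakoe-Chiba band
# as the similarity inside an attention read-out.
import numpy as np

def soft_dtw(x, y, gamma=0.1, band=4):
    """Soft-DTW distance between 1-D sequences x and y, restricted to a
    Sakoe-Chiba band of half-width `band` (band must cover |len(x)-len(y)|)."""
    n, m = len(x), len(y)
    R = np.full((n + 1, m + 1), np.inf)  # out-of-band cells stay +inf
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        # Sakoe-Chiba constraint: only cells with |i - j| <= band are reachable.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Smoothed minimum over the three DTW predecessors (log-sum-exp).
            z = -np.array([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]]) / gamma
            zmax = z.max()
            softmin = -gamma * (zmax + np.log(np.exp(z - zmax).sum()))
            R[i, j] = cost + softmin
    return R[n, m]

def dtw_attention(queries, keys, values, gamma=0.1, band=4, tau=1.0):
    """Attention where similarity = negated soft-DTW distance, so that
    well-aligned query/key patches receive larger weights."""
    sims = np.array([[-soft_dtw(q, k, gamma, band) for k in keys]
                     for q in queries]) / tau
    weights = np.exp(sims - sims.max(axis=1, keepdims=True))  # stable softmax
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

# Toy usage: 3 query patches and 4 key patches of length 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 16))   # each key carries a 16-dim value vector
print(dtw_attention(Q, K, V).shape)  # (3, 16)
```

Two properties of this sketch mirror the abstract's claims: the band constraint limits how far the warping path can stray from the diagonal, which bounds cost and discourages pathological alignments, and the soft (log-sum-exp) minimum makes the distance differentiable, so it can sit inside a trainable attention layer.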