Analyzing Deep Transformer Models for Time Series Forecasting via Manifold Learning

TMLR Paper 2803 Authors

05 Jun 2024 (modified: 17 Oct 2024)
Decision pending for TMLR
License: CC BY 4.0
Abstract: Transformer models have consistently achieved remarkable results in domains such as natural language processing and computer vision. However, despite ongoing research efforts, a comprehensive understanding of these models is still lacking. This is particularly true for deep time series forecasting methods, where analysis and understanding work is relatively limited. Time series data, unlike image and text information, can be more challenging to interpret and analyze. To address this, we approach the problem from a \emph{manifold learning} perspective, assuming that the latent representations of time series forecasting models lie near a low-dimensional manifold. In our study, we analyze the geometric features of these latent data manifolds, including their intrinsic dimension and principal curvatures. Our findings reveal that deep transformer models exhibit similar geometric behavior across layers, and that these geometric features are correlated with model performance. Additionally, we observe that untrained models initially have different structures, but that these rapidly converge during training. By leveraging our geometric analysis and differentiable tools, we can potentially design new and improved deep forecasting neural networks. This approach complements existing analysis studies and contributes to a better understanding of transformer models in the context of time series forecasting.
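
The abstract names two geometric quantities, intrinsic dimension and principal curvatures, without fixing an estimator; the authors' actual pipeline is in the linked repository. As an illustration only, the sketch below estimates the intrinsic dimension of hypothetical latent vectors Z (one row per sample, extracted from some transformer layer) with the TwoNN maximum-likelihood estimator of Facco et al. (2017), a standard choice for this kind of analysis.

    # Minimal TwoNN intrinsic-dimension sketch (illustrative, not the
    # authors' code). Z is a hypothetical (n_samples, n_features) array
    # of latent representations from one layer of a forecasting model.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def twonn_intrinsic_dimension(Z: np.ndarray) -> float:
        """Maximum-likelihood TwoNN estimate of intrinsic dimension."""
        # Distances to the two nearest neighbors; column 0 is the
        # query point itself (distance 0) when querying the fit data.
        nn = NearestNeighbors(n_neighbors=3).fit(Z)
        dists, _ = nn.kneighbors(Z)
        r1, r2 = dists[:, 1], dists[:, 2]
        valid = r1 > 0                 # drop duplicate points
        mu = r2[valid] / r1[valid]     # neighbor-distance ratios
        # MLE for the TwoNN model: d = N / sum_i log(mu_i)
        return valid.sum() / np.log(mu).sum()

    # Sanity check: points on a 2-D plane linearly embedded in 128-D
    # should yield an estimate close to 2.
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 128))
    print(f"estimated intrinsic dimension: {twonn_intrinsic_dimension(Z):.2f}")

Running the same estimator on representations taken from successive layers is one simple way to trace how the latent manifold's dimension evolves with depth, which is the kind of per-layer geometric profile the abstract describes.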
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/azencot-group/GATLM
Assigned Action Editor: ~Gabriel_Loaiza-Ganem1
Submission Number: 2803