Analyzing Deep Transformer Models for Time Series Forecasting via Manifold Learning

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: representation learning, manifold analysis, deep neural networks, time series forecasting
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Deep transformer models consistently achieve groundbreaking results in natural language processing and computer vision, among other engineering and scientific domains. However, despite active research aimed at better understanding transformer neural networks, e.g., by computing saliency scores or analyzing their attention matrices, these models remain poorly understood overall. This problem is further exacerbated for deep time series forecasting methods, for which analysis and interpretability work is relatively scarce. Indeed, deep time series forecasting methods emerged as state-of-the-art only recently, and time series data may be less "natural" to interpret and analyze than image and text information. Complementary to existing analysis studies, we adopt a manifold learning viewpoint: we assume that the latent representations of time series forecasting models lie near a low-dimensional manifold. In this work, we study geometric features of these latent data manifolds, including their intrinsic dimension and principal curvatures. Our results demonstrate that deep transformer models share a similar geometric behavior across layers, and that these geometric features correlate with model performance. Further, untrained models exhibit different structures, which rapidly converge during training. Our geometric analysis and differentiable tools may be used to design new and improved deep forecasting neural networks.
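The abstract does not specify which intrinsic-dimension estimator the authors use, but a common choice for analyzing latent representations is the TwoNN estimator, which fits the ratio of each point's second- and first-nearest-neighbor distances. The sketch below is a minimal, hypothetical illustration of that idea (the function name and the synthetic embedding are assumptions, not the paper's method):

```python
import numpy as np

def two_nn_intrinsic_dim(X: np.ndarray) -> float:
    """Estimate intrinsic dimension with the TwoNN estimator:
    the ratio r2/r1 of nearest-neighbor distances follows a Pareto
    distribution whose shape parameter equals the intrinsic dimension."""
    # Pairwise squared Euclidean distances via the Gram-matrix identity.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    d2.sort(axis=1)                      # column 0 is the self-distance (0)
    r1 = np.sqrt(d2[:, 1])               # distance to 1st neighbor
    r2 = np.sqrt(d2[:, 2])               # distance to 2nd neighbor
    mu = r2 / r1                         # Pareto-distributed ratios
    return len(mu) / np.sum(np.log(mu))  # maximum-likelihood estimate

# Sanity check: a 3-D latent manifold linearly embedded in 16-D ambient space
# (a toy stand-in for a transformer's hidden representations).
rng = np.random.default_rng(0)
latent = rng.standard_normal((1500, 3))
embedded = latent @ rng.standard_normal((3, 16))
print(two_nn_intrinsic_dim(embedded))  # expect a value near 3
```

Applied per layer to a forecasting model's hidden states, such an estimator yields the kind of layer-wise intrinsic-dimension profile the abstract describes.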
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6966