Effectively Designing 2-Dimensional Sequence Models for Multivariate Time Series

Published: 06 Mar 2025, Last Modified: 15 Apr 2025 | ICLR 2025 Workshop World Models | CC BY 4.0
Keywords: Multivariate Time Series, linear recurrent model, Transformers
Abstract: Although Transformers dominate fields like language modeling and computer vision, they often underperform simple linear baselines on time series tasks. Conversely, linear sequence models provide an efficient, causally biased alternative that excels at autoregressive processes. However, they are fundamentally limited to single-sequence modeling and cannot capture inter-variate dependencies in multivariate time series. Here, we introduce Typhon, a flexible framework that applies two sequence models, one along the time dimension and one along the variate dimension, and merges them with a Dimension Mixer module, allowing inter-variate information to flow during learning. Building on Typhon, we introduce T4 (Test Time Training with a cross-variate Transformer), which employs a meta-model for on-the-fly forecasting along the time dimension and a Transformer across variates to capture their dependencies. The Typhon framework's flexibility lets us benchmark T4 alongside various modern recurrent models, revealing that constant-memory recurrence struggles with long-term dependencies and error propagation. To address this, we introduce Gated Multiresolution Convolution (GMC), a simple, attention-free Typhon variant. With a carefully designed constant-size multiresolution memory, GMC captures long-term dependencies while mitigating error propagation. Our experiments validate Typhon's 2D inductive bias and demonstrate the superior performance of GMC and T4 across diverse benchmarks.
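To make the 2D design concrete, below is a minimal sketch of the layout the abstract describes: one causal sequence model run along the time axis of each variate, one attention model run across variates at each time step, and a merge standing in for the Dimension Mixer. All module choices here (a GRU as the time model, multi-head attention as the variate model, a learned sigmoid gate as the mixer) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of Typhon's 2D structure; module choices are assumed.
import torch
import torch.nn as nn

class TyphonBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        # Time model: any causal sequence model could go here; a GRU stands in.
        self.time_model = nn.GRU(d_model, d_model, batch_first=True)
        # Variate model: attention across variates (order-free, non-causal).
        self.var_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # "Dimension Mixer": a learned gate blending the two streams (assumed form).
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_variates, seq_len, d_model)
        b, v, t, d = x.shape
        # Run the time model independently over each variate's sequence.
        h_time, _ = self.time_model(x.reshape(b * v, t, d))
        h_time = h_time.reshape(b, v, t, d)
        # Run attention across variates at each time step.
        xv = x.permute(0, 2, 1, 3).reshape(b * t, v, d)
        h_var, _ = self.var_attn(xv, xv, xv)
        h_var = h_var.reshape(b, t, v, d).permute(0, 2, 1, 3)
        # Merge the two views so inter-variate information can flow.
        g = torch.sigmoid(self.gate(torch.cat([h_time, h_var], dim=-1)))
        return g * h_time + (1 - g) * h_var

# Usage: 2 batches, 7 variates, 96 steps, 64-dim embeddings.
out = TyphonBlock(64)(torch.randn(2, 7, 96, 64))
```

In this reading, swapping the time model for a test-time-training meta-model yields something in the spirit of T4, while replacing it with gated multiresolution convolutions points toward GMC; the abstract does not specify these internals.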
Submission Number: 75