Keywords: time series, time series foundation model
TL;DR: We propose a zero-shot Time Series Learner via Hierarchical Interleaved Block Attention
Abstract: The rapid advancement of time series foundation models (TSFMs) has been propelled by migrating architectures from language modeling. While existing TSFMs demonstrate impressive performance, their direct adoption of cross-domain architectures constrains the effective capture of the multi-scale temporal dependencies inherent to time series data. This limitation becomes particularly pronounced during zero-shot transfer across datasets with divergent underlying patterns and sampling strategies. To address these challenges, we propose Hierarchical Interleaved Block Attention (HIBA), which employs hierarchical inter- and intra-block sparse attention to effectively capture multi-scale dependencies. Intra-block attention facilitates local information exchange, while inter-block attention operates across blocks to capture global temporal pattern interaction and dynamic evolution. Leveraging the HIBA architecture, we introduce Xihe, a scalable TSFM family spanning from an ultra-efficient 9.5M-parameter configuration to a high-capacity 1.5B-parameter variant. Evaluated on the comprehensive GIFT-Eval benchmark, our most compact Xihe-tiny model (9.5M) surpasses the majority of contemporary TSFMs, demonstrating remarkable parameter efficiency. More impressively, Xihe-max (1.5B) establishes new state-of-the-art zero-shot performance, surpassing previous best results by a substantial margin. This consistent excellence across the entire parameter spectrum provides compelling evidence for the exceptional generalization capability and architectural advantages of HIBA.
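The abstract describes the HIBA mechanism only at a high level (intra-block attention for local exchange, inter-block attention across blocks for global interaction). Below is a minimal PyTorch sketch of one plausible reading of that description; the module layout, parameter names (`block_size`, `d_model`), and the strided inter-block pattern are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of interleaved intra-/inter-block sparse attention.
# Assumptions: sequence length divisible by block_size; pre-norm residual layout.
import torch
import torch.nn as nn


class InterleavedBlockAttention(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4, block_size: int = 16):
        super().__init__()
        self.block_size = block_size
        self.intra_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.inter_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        nb, bs = t // self.block_size, self.block_size

        # Intra-block attention: each token attends only within its own block
        # (local information exchange).
        h = x.reshape(b * nb, bs, d)
        q = self.norm1(h)
        h = h + self.intra_attn(q, q, q)[0]

        # Inter-block attention: tokens at the same intra-block offset attend
        # across blocks, providing a sparse path for global temporal interaction.
        h = h.reshape(b, nb, bs, d).transpose(1, 2).reshape(b * bs, nb, d)
        q = self.norm2(h)
        h = h + self.inter_attn(q, q, q)[0]

        # Restore the original (batch, seq_len, d_model) layout.
        return h.reshape(b, bs, nb, d).transpose(1, 2).reshape(b, t, d)


if __name__ == "__main__":
    layer = InterleavedBlockAttention()
    out = layer(torch.randn(2, 128, 64))  # 128 timesteps -> 8 blocks of 16
    print(out.shape)  # torch.Size([2, 128, 64])
```

Both attention stages are sparse relative to full attention: each token attends to block_size tokens in the intra step and num_blocks tokens in the inter step, rather than the full sequence.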
Primary Area: learning on time series and dynamical systems
Submission Number: 16977