Keywords: Time Series Forecasting, Large Model, Time Series Foundation Models
Abstract: Time series (TS) forecasting plays a vital role in practice but remains a highly challenging task. The strong performance of large-scale models across multiple domains has driven the development of large-scale TS models, offering an effective pathway for forecasting tasks. Yet performance degradation has been observed in large-scale TS models, a puzzling phenomenon demonstrating that bigger is not always better. We trained two categories of large-scale TS models, LLM4TS and TSFMs, across four scales, examining how architecture, model size, data volume and distribution, and training strategies influence model performance. Because representations in large-scale TS models have not been studied in depth, we examined how they evolve from both inter-layer and intra-layer perspectives. Our analysis reveals that only a small subset of layers plays a critical role in learning, while the majority contribute minimally, a phenomenon we term few-layer dominance. Building on this insight, we propose a method to identify critical layers, allowing models to achieve comparable performance while improving inference efficiency. Validation on existing large-scale TS models confirms the universality of few-layer dominance and the reliability of the critical-layer identification method.
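The abstract does not specify how critical layers are identified, so the following is only a minimal illustrative sketch. It assumes one plausible reading of few-layer dominance: layers whose output is nearly identical to their input contribute little and can be skipped, while layers that meaningfully transform their input are "critical." The function names (`find_critical_layers`, `cosine_sim`) and the similarity threshold are hypothetical, not from the paper.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two flattened representation matrices.
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def find_critical_layers(hidden_states, threshold=0.98):
    """Flag layers whose output differs notably from their input.

    hidden_states: list of (tokens, dim) arrays at each layer boundary;
    hidden_states[0] is the embedding output, hidden_states[i] the
    output of layer i. Layers acting as a near-identity map
    (similarity >= threshold) are treated as skippable. This criterion
    is an assumption, not the paper's actual method.
    """
    critical = []
    for i in range(1, len(hidden_states)):
        if cosine_sim(hidden_states[i - 1], hidden_states[i]) < threshold:
            critical.append(i)  # layer i meaningfully transforms its input
    return critical

# Toy usage: 8 "layers", only layers 2 and 5 apply a real transformation;
# the rest add negligible noise, mimicking layers that contribute minimally.
rng = np.random.default_rng(0)
h = [rng.normal(size=(64, 128))]
for i in range(1, 9):
    if i in (2, 5):
        h.append(h[-1] @ rng.normal(scale=0.2, size=(128, 128)))
    else:
        h.append(h[-1] + rng.normal(scale=1e-3, size=(64, 128)))
print(find_critical_layers(h))  # -> [2, 5]
```

Under this reading, inference efficiency improves by routing computation only through the flagged layers; any real implementation would depend on the paper's actual identification criterion.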
Primary Area: learning on time series and dynamical systems
Submission Number: 24832