When Bigger Isn't Better: The Role of Model-Data Complexity in Time Series Forecasts

ICLR 2026 Conference Submission 20549 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: time series forecasting, foundation model, benchmark
Abstract: Large, over-parameterized models have become the dominant paradigm in machine learning, with foundation models claiming universal applicability across diverse tasks such as time series forecasting. Yet it remains unclear how such models behave across the full spectrum of data complexity, as estimated by the complexity of the generative processes that produce the data. In this work, we show that large foundation models often struggle to forecast simple data. We propose a systematic benchmarking approach that evaluates models at multiple levels of complexity, from classic statistical methods (e.g., ARIMA) through mid-complexity deep networks to large foundation models, against datasets spanning simple, deterministic patterns to highly stochastic processes. We show that model effectiveness depends jointly on model complexity and data complexity: simpler, structured datasets often favor lower-capacity or classical methods, while complex, noisy datasets generally benefit from higher-capacity machine learning models. In particular, foundation models often fail on simple signals where inductive bias and parsimonious modeling suffice. These findings show that "bigger" is not inherently "better", reaffirm the classical approximation-estimation trade-off in the zero-shot setting, and underscore the need for data-aware, task-specific model selection that balances data and model complexity rather than one-size-fits-all deployment.
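As a minimal illustration of the kind of benchmark the abstract describes, the sketch below scores a classical low-capacity baseline on synthetic series spanning the deterministic-to-stochastic spectrum. The generators (sinusoid, trend plus noise, random walk), the ARIMA(2,1,2) order, the MAE metric, and the 24-step horizon are all illustrative assumptions, not the paper's actual protocol; a foundation model's zero-shot forecast would be scored on the same splits for comparison.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical generators spanning the complexity spectrum described in
# the abstract: deterministic, structured-plus-noise, and purely stochastic.
def make_series(kind: str, n: int = 200, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    if kind == "sinusoid":       # simple, deterministic pattern
        return np.sin(2 * np.pi * t / 24)
    if kind == "trend_noise":    # mid-complexity: structure plus noise
        return 0.05 * t + rng.normal(scale=0.5, size=n)
    if kind == "random_walk":    # highly stochastic process
        return np.cumsum(rng.normal(size=n))
    raise ValueError(kind)

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean(np.abs(y_true - y_pred)))

horizon = 24  # illustrative forecast horizon
for kind in ["sinusoid", "trend_noise", "random_walk"]:
    y = make_series(kind)
    train, test = y[:-horizon], y[-horizon:]
    # Classical low-capacity baseline; larger models would be evaluated
    # on the identical train/test split to compare across complexity.
    fit = ARIMA(train, order=(2, 1, 2)).fit()
    print(f"{kind:12s} ARIMA MAE = {mae(test, fit.forecast(horizon)):.3f}")
```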
Primary Area: learning on time series and dynamical systems
Submission Number: 20549