Beyond Accuracy: Are Time Series Foundation Models Well-Calibrated?

Published: 26 Jan 2026 · Last Modified: 27 Feb 2026 · ICLR 2026 Poster · CC BY 4.0
Keywords: Time Series, Foundation Models, Calibration, Confidence
TL;DR: We evaluate the calibration of time series foundation models and find that they are generally well-calibrated.
Abstract: The recent development of foundation models for time series data has generated considerable interest in using such models across a variety of applications. Although foundation models achieve state-of-the-art predictive performance, their calibration properties remain relatively underexplored, despite the fact that calibration can be critical for many practical applications. In this paper, we investigate the calibration properties of five recent time series foundation models and two competitive baselines. We perform a series of systematic evaluations assessing model calibration (i.e., over- or under-confidence), the effects of varying prediction heads, and calibration under long-term autoregressive forecasting. We find that time series foundation models are consistently better calibrated than baseline models and tend to be neither systematically over- nor under-confident, in contrast to the overconfidence often seen in other deep learning models.
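To make the notion of calibration concrete, below is a minimal sketch of how one might measure it for a probabilistic forecaster that outputs quantiles. This is an illustrative example, not the paper's exact evaluation protocol: the function name `quantile_calibration`, the array shapes, and the toy Gaussian forecaster are all assumptions for demonstration. Empirical coverage matching the nominal quantile level indicates good calibration; coverage above the level suggests under-confidence (intervals too wide), below it over-confidence (intervals too narrow).

```python
import numpy as np

def quantile_calibration(y_true, quantile_preds, levels):
    """Compare empirical coverage of predicted quantiles to nominal levels.

    y_true:         (n,) observed values
    quantile_preds: (n, k) predicted quantiles, one column per nominal level
    levels:         (k,) nominal quantile levels, e.g. [0.1, ..., 0.9]

    Returns per-level empirical coverage and the mean absolute
    calibration error across levels.
    """
    y_true = np.asarray(y_true)
    levels = np.asarray(levels)
    # Fraction of observations falling at or below each predicted quantile.
    coverage = (y_true[:, None] <= quantile_preds).mean(axis=0)
    return coverage, np.abs(coverage - levels).mean()

# Toy usage: a "forecaster" that predicts the quantiles of a standard
# Gaussian for every observation (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
y = rng.normal(size=1000)
levels = np.arange(0.1, 1.0, 0.1)
preds = np.tile(np.quantile(rng.normal(size=10_000), levels), (1000, 1))
coverage, mce = quantile_calibration(y, preds, levels)
print(dict(zip(np.round(levels, 1), np.round(coverage, 3))), f"MCE={mce:.3f}")
```

For a well-calibrated forecaster, the printed coverages should track the nominal levels closely and the mean calibration error (MCE) should be near zero.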
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9332