GIFT-Eval: A Benchmark for General Time Series Forecasting Model Evaluation

Published: 10 Oct 2024, Last Modified: 26 Nov 2024, Venue: NeurIPS 2024 TSALM Workshop, License: CC BY 4.0
Keywords: time series forecasting, benchmarking, foundation models, forecasting, univariate forecasting, multivariate forecasting, pretraining data, deep learning, statistical models, dataset
TL;DR: We introduce a new benchmark designed specifically for time series foundation models, featuring diverse time series characteristics and structured with pretrain, train, and test splits.
Abstract: The development of time series foundation models has been constrained by the absence of comprehensive benchmarks. This paper introduces the **G**eneral T**I**me Series **F**orecas**T**ing Model **Eval**uation, GIFT-Eval, a benchmark specifically designed to address this gap. GIFT-Eval encompasses 28 datasets with over 144,000 time series and 157 million observations, spanning seven domains and featuring a variety of frequencies, numbers of variates, and prediction lengths ranging from short- to long-term forecasts. Our benchmark facilitates the effective pretraining and evaluation of foundation models. We present a detailed analysis of 12 baseline models, including statistical, deep learning, and foundation models. We further provide a fine-grained analysis of each model across the different characteristics of our benchmark. We hope that the insights from this analysis, together with access to this new standard zero-shot time series forecasting benchmark, will guide future developments in time series forecasting foundation models.
Submission Number: 84