TL;DR: We develop a wavelet-based tokenizer and pretrain a foundation model for time series forecasting that operates directly on time-localized frequencies. Results show excellent generalization and a superior ability to capture complex patterns of practical relevance.
Abstract: How to best develop foundational models for time series forecasting remains an important open question. Tokenization is a crucial consideration in this effort: what is an effective discrete vocabulary for a real-valued sequential input? To address this question, we develop WaveToken, a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies. Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pretrains an autoregressive model to forecast coefficients over the prediction horizon. By decomposing coarse and fine structures in the inputs, wavelets provide an eloquent and compact language for time series forecasting that simplifies learning. Empirical results on a comprehensive benchmark, including 42 datasets for both in-domain and zero-shot settings, show that WaveToken: i) performs on par with or better than recently proposed foundation models for forecasting while using a much smaller vocabulary (1024 tokens), and is competitive with modern deep learning models trained specifically on each dataset; ii) exhibits superior generalization capabilities, achieving the best average rank across all datasets for three complementary metrics; and iii) easily captures complex temporal patterns of practical relevance that are challenging for other recent pre-trained models, including trends, sparse spikes, and non-stationary time series with varying frequencies evolving over time.
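As a rough illustration of the pipeline the abstract describes (scale, decompose, threshold, quantize), the following is a minimal sketch in Python using PyWavelets. The Haar wavelet, the decomposition level, the hard-thresholding rule, and the uniform 1024-bin quantizer are illustrative assumptions, not the paper's exact choices; the paper's actual scaling, thresholding, and quantization schemes may differ.

```python
# Minimal sketch of a wavelet-based tokenization pipeline: scale -> decompose
# -> threshold -> quantize, plus the inverse mapping back to a time series.
# All concrete choices (Haar wavelet, level=4, hard threshold, uniform bins)
# are illustrative assumptions, not the paper's exact configuration.
import numpy as np
import pywt

VOCAB_SIZE = 1024  # the paper reports a 1024-token vocabulary

def tokenize(series: np.ndarray, wavelet: str = "haar", level: int = 4):
    """Map a 1-D real-valued series to a sequence of discrete token ids."""
    # 1) Scale: normalize so coefficient magnitudes are comparable across series.
    scale = np.mean(np.abs(series)) + 1e-8
    x = series / scale

    # 2) Decompose into coarse (approximation) and fine (detail) coefficients.
    coeffs = pywt.wavedec(x, wavelet, level=level)
    flat = np.concatenate(coeffs)

    # 3) Threshold: zero out small coefficients to sparsify the representation
    #    (hard thresholding here; the paper's rule may differ).
    thr = 0.1 * np.max(np.abs(flat))
    flat = pywt.threshold(flat, thr, mode="hard")

    # 4) Quantize coefficients into a discrete vocabulary of VOCAB_SIZE tokens.
    bins = np.linspace(flat.min(), flat.max() + 1e-8, VOCAB_SIZE + 1)
    tokens = np.digitize(flat, bins[1:-1])  # token ids in [0, VOCAB_SIZE)
    meta = (bins, scale, [c.shape[0] for c in coeffs], wavelet)
    return tokens, meta

def detokenize(tokens: np.ndarray, meta) -> np.ndarray:
    """Invert quantization and the wavelet transform to recover a series."""
    bins, scale, lengths, wavelet = meta
    centers = (bins[:-1] + bins[1:]) / 2
    flat = centers[tokens]
    # Re-split the flat coefficient vector into per-level arrays.
    coeffs, i = [], 0
    for n in lengths:
        coeffs.append(flat[i:i + n])
        i += n
    return pywt.waverec(coeffs, wavelet) * scale
```

In this sketch, an autoregressive model would be trained on the resulting token sequences and its predicted tokens passed through `detokenize` to produce the forecast; the thresholding step is what yields the sparse, compact vocabulary the abstract emphasizes.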
Lay Summary: Predicting future events from past data (time series forecasting) is crucial in many fields, from finance to climate science. While powerful AI models (referred to as "foundation models") excel at language, adapting them to understand continuous time series data is challenging. A key hurdle is "tokenization": finding an effective way to convert real-valued sequences into a discrete "language" these models can process, especially one that captures both broad trends and sudden changes efficiently. This paper introduces WaveToken, a novel method that uses wavelets, mathematical tools that break down signals into different frequencies at specific times. WaveToken first decomposes the time series into these wavelet components, then simplifies and converts them into a compact set of "tokens". A foundation model is then trained to predict these future wavelet tokens. WaveToken performs as well as or better than existing advanced forecasting models, even those specifically trained for individual datasets, while using a significantly smaller vocabulary (fewer "words"). This compactness helps it generalize better to unseen data. Crucially, WaveToken accurately captures complex patterns such as trends, sudden spikes, and signals with evolving frequencies, areas where other models often falter. This makes it a promising approach for more robust and efficient time series forecasting.
Primary Area: Deep Learning->Sequential Models, Time series
Keywords: time series, forecasting, foundation models, wavelets, tokenization, frequency, pretrained model, nonstationary
Submission Number: 7667