Speaking Numbers to LLMs: Multi-Wavelet Number Embeddings for Time Series Forecasting

Published: 09 Jun 2025, Last Modified: 09 Jun 2025, FMSD @ ICML 2025, CC BY 4.0
Keywords: Numerical Understanding, LLMs, Time Series, Forecasting Task, Embedding
TL;DR: A wavelet-based number embedding that bridges statistical signal processing with LLM natural language understanding, improving time series forecasting beyond what either discipline achieves alone.
Abstract: Large language models (LLMs) struggle with time series analysis: temporal data is numerical, which conflicts with their text-focused pre-training, and standard tokenization can fragment numbers and disrupt temporal patterns. To address this, we introduce Multi-Wavelet Number Embedding (MWNE), a novel technique that uses wavelet theory to decompose numerical values and effectively capture multi-scale temporal features. Theoretically, MWNE bridges this modality gap by ensuring digit recovery, numeracy preservation, enhanced discriminability through multi-scale wavelets, and robustness to normalization, effectively providing LLMs with a numerically sound "language of numbers" for more natural time series processing. Our empirical results support this theoretical framework: extensive evaluations demonstrate that MWNE-augmented LLMs significantly outperform baselines on diverse forecasting benchmarks, often matching or exceeding specialized time series models.
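The abstract describes the idea only at a high level (wavelet decomposition of numerical values into multi-scale features with digit recovery), not the exact construction. A minimal illustrative sketch of that idea, assuming a fixed-point digit expansion and an orthonormal Haar wavelet (the function names `mwne_sketch` and `haar_step` are hypothetical, not from the paper), might look like:

```python
import numpy as np

def haar_step(x):
    # One level of orthonormal Haar analysis: pairwise averages
    # (approximation) and pairwise differences (detail).
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def mwne_sketch(value, n_digits=8):
    """Toy multi-scale number embedding (NOT the paper's exact MWNE):
    expand a value into fixed-point digits, then stack Haar coefficients
    from every decomposition level into one coarse-to-fine feature vector.
    Because the Haar transform is orthonormal, the digits (and hence the
    value, up to quantization) are exactly recoverable from the embedding.
    """
    # Fixed-point digit expansion, most significant digit first;
    # half the digits are allocated to the fractional part.
    scaled = int(round(abs(value) * 10 ** (n_digits // 2)))
    digits = [(scaled // 10 ** i) % 10 for i in range(n_digits)][::-1]
    x = np.asarray(digits, dtype=float)
    coeffs = []
    while len(x) > 1:
        x, d = haar_step(x)
        coeffs.append(d)          # fine-to-coarse detail coefficients
    coeffs.append(x)              # final approximation (coarsest scale)
    return np.concatenate(coeffs[::-1])  # coarse-to-fine embedding
```

Since each Haar step is orthonormal, the embedding has the same length and Euclidean norm as the digit vector, which is one simple way to see how "digit recovery" and "numeracy preservation" can coexist with a multi-scale representation.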
Submission Number: 98