Time Series as Videos: Spectro-Temporal Generative Diffusion

ICLR 2026 Conference Submission 18914 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0
Keywords: Time-Series Generation, Unconditional Generation, Video Diffusion, Time Frequency Plane
TL;DR: ST-Diff achieves SOTA time series generation by reframing data as video (via STFT) and training a custom diffusion model on its spectro-temporal dynamics.
Abstract: Generative modeling of multivariate time series is challenged by properties such as non-stationarity, intricate cross-channel correlations, and multi-scale temporal dependencies. Existing diffusion models for this task mainly operate directly in the time domain, employing architectures that are not designed to capture complex spectral dynamics. Conversely, methods that transform sequences into static images collapse the temporal axis, precluding the use of models designed for spatiotemporal dynamics. This paper argues for a new, unifying paradigm: reframing time series as videos. To this end, we introduce Spectro-Temporal Diffusion (ST-Diff), a framework that first uses the Short-Time Fourier Transform (STFT) to convert a multivariate time series into a time-frequency video tensor. In this representation, the frequency and covariate axes form the spatial dimensions of each frame, while the temporal evolution of the frequency spectrum is explicitly preserved across frames. To capitalize on this structure, we design and implement a custom video diffusion model architected to exploit the spectro-temporal dynamics, that is, the evolution of frequency components over time. Through extensive empirical evaluation on standard benchmarks, we demonstrate that combining this representation with a tailored architecture allows ST-Diff to establish a new state of the art in unconditional time series generation. We argue that this time-series-as-video paradigm has significant potential to advance a broad spectrum of sequence modeling tasks beyond unconditional time series generation.
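The time-series-to-video mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `series_to_video`, the Hann window, the values of `n_fft` and `hop`, and the choice to store real and imaginary parts as two frame channels are all assumptions made for illustration. The sketch only shows how an STFT can turn a series of shape (length, covariates) into a tensor whose frames have frequency and covariate axes as spatial dimensions.

```python
import numpy as np


def series_to_video(x, n_fft=16, hop=4):
    """Convert a multivariate series into a time-frequency "video" tensor.

    x: array of shape (length, covariates).
    Returns an array of shape (frames, 2, n_fft // 2 + 1, covariates), where
    axis 1 holds the real and imaginary STFT parts (an assumed layout).
    """
    length, _ = x.shape
    window = np.hanning(n_fft)                 # assumed analysis window
    n_frames = 1 + (length - n_fft) // hop     # no padding at the edges
    # Slice overlapping segments: shape (frames, n_fft, covariates).
    segments = np.stack(
        [x[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )
    # Window each segment and take the one-sided FFT along the segment axis.
    spec = np.fft.rfft(segments * window[None, :, None], axis=1)
    # Each frame is a (frequency x covariate) image with two channels.
    return np.stack([spec.real, spec.imag], axis=1)


# Usage: three covariates, 64 time steps; the first is a pure sinusoid
# with 4 cycles per 16 samples, so its energy should sit in bin 4.
n = np.arange(64)
x = np.zeros((64, 3))
x[:, 0] = np.sin(2 * np.pi * 4 * n / 16)
x[:, 1:] = np.random.default_rng(0).standard_normal((64, 2))

video = series_to_video(x)
magnitude = np.hypot(video[:, 0], video[:, 1])  # (frames, freq, covariates)
print(video.shape)
print(magnitude[0, :, 0].argmax())
```

In this layout the leading axis plays the role of video time, so a model built for spatiotemporal data can consume the tensor directly; whether ST-Diff uses this exact layout, magnitude and phase, or another encoding is not stated in the abstract.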
Supplementary Material: zip
Primary Area: generative models
Submission Number: 18914