Multi-view Latent Diffusion Reconstruction for Vision-enhanced Time Series Forecasting

18 Sept 2025 (modified: 19 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: time series forecasting, diffusion models, multi-modal learning
TL;DR: We propose LDM4TS, a vision-enhanced time series forecasting framework that transforms sequences into visual representations and leverages multi-modal condition-guided latent diffusion models for forecasting.
Abstract: Recent studies have explored diffusion models for time series forecasting, yet most methods operate directly on 1D signals and tend to overlook intrinsic temporal structures (e.g., periodicity and trend). This often leads to suboptimal long-range dependency modeling and poorly calibrated uncertainty. To this end, we propose LDM4TS, a vision-enhanced time series forecasting framework that transforms time series into structured 2D visual representations and leverages the image reconstruction capabilities of diffusion models. Raw sequences are first converted into complementary visual inputs, forming multiple views that collectively capture diverse temporal structures. By leveraging the generative nature of the diffusion process, the framework not only yields accurate point forecasts but also characterizes predictive uncertainty. Extensive experiments demonstrate that LDM4TS outperforms specialized forecasting models across a range of time series forecasting tasks.
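
The abstract does not specify which visual encodings form the complementary views, but the general idea of mapping a 1D sequence to multiple 2D images can be illustrated with standard time-series-to-image encodings such as period folding, Gramian angular fields, and recurrence plots. The sketch below is a minimal illustration under those assumptions; all function names are hypothetical and not taken from the paper.

```python
import numpy as np

def dominant_period(x: np.ndarray) -> int:
    """Estimate the dominant period via the FFT amplitude spectrum."""
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    spectrum[0] = 0.0                      # drop the DC component
    k = int(np.argmax(spectrum))           # strongest frequency bin
    return max(2, len(x) // max(k, 1))

def segment_view(x: np.ndarray) -> np.ndarray:
    """Fold the 1D series into a (num_periods, period) 2D grid."""
    p = dominant_period(x)
    n = (len(x) // p) * p                  # truncate to whole periods
    return x[:n].reshape(-1, p)

def gramian_angular_field(x: np.ndarray) -> np.ndarray:
    """GAF encoding: pairwise angular sums of the min-max scaled series."""
    z = 2 * (x - x.min()) / (x.max() - x.min() + 1e-8) - 1
    phi = np.arccos(np.clip(z, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])

def recurrence_plot(x: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """Binary recurrence plot: 1 where two time points lie within eps."""
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps * (x.max() - x.min() + 1e-8)).astype(np.float32)

# Example: three complementary 2D views of one synthetic series.
t = np.arange(256)
series = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.randn(256)
views = {
    "segment": segment_view(series),
    "gaf": gramian_angular_field(series),
    "recurrence": recurrence_plot(series),
}
for name, v in views.items():
    print(name, v.shape)
```

In a pipeline like the one the abstract describes, such 2D views would serve as inputs (or conditioning signals) to a latent diffusion model, whose sampled reconstructions can then be decoded back into forecasts, with repeated sampling giving an empirical spread for uncertainty.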
Primary Area: learning on time series and dynamical systems
Submission Number: 12141