Unified Generative Modeling for Multimodal Time Series Analysis

14 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Time series; Multimodal; Unified generative modeling; Multi-task
TL;DR: GenTS is a unified generative model that jointly learns from time series and auxiliary modalities like text, enabling diverse tasks such as generation, forecasting, and editing while alleviating multimodal data scarcity.
Abstract: Modeling multimodal time series has become an emerging research focus, aiming to incorporate auxiliary modalities, such as textual descriptions, into time series analysis. This integration enables a deeper understanding of temporal patterns by leveraging diverse sources of information. However, existing approaches often treat external modalities merely as supplementary domain features, neglecting the joint distribution between time series and auxiliary modalities. Moreover, most prior methods are tailored to specific tasks, limiting their generality and the effective utilization of multimodal data. In this paper, we propose GenTS, a unified generative model for multimodal time series analysis that integrates a variety of downstream tasks within a single modeling framework. GenTS is trained simultaneously to generate time series from textual descriptions and to forecast future values conditioned on historical multimodal data. This approach enables the model to capture the joint distribution between time series and external modalities, supporting a broad range of applications such as conditional generation, forecasting, and time series editing. Furthermore, by incorporating time series captioning as an integral component, GenTS largely alleviates the common challenge of multimodal data scarcity. Extensive experiments on diverse real-world datasets demonstrate the effectiveness and generality of our approach across multiple tasks.
Primary Area: learning on time series and dynamical systems
Submission Number: 4941