TimeAutoDiff: A Unified Framework for Generation, Imputation, Forecasting, and Time-Varying Metadata Conditioning of Heterogeneous Time Series Tabular Data
Abstract: We present \texttt{TimeAutoDiff}, a unified latent–diffusion framework that addresses four fundamental time-series tasks—unconditional generation, missing-data imputation, forecasting, and time-varying-metadata conditional generation—within a single model that natively handles heterogeneous features (continuous, binary, and categorical).
We unify these tasks through a simple masked-modeling strategy: a binary mask specifies which time–feature cells are observed and which must be generated.
To make this work on mixed data types, we pair a lightweight variational autoencoder—which maps continuous, categorical, and binary variables into a continuous latent sequence—with a diffusion model that learns dynamics in that latent space, avoiding separate likelihoods for each data type while still capturing temporal and cross-feature structure.
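To make the masked-modeling idea concrete, the following is a minimal PyTorch-style sketch (not the paper's code) of how the four tasks reduce to different binary masks over a \(T \times F\) table of time–feature cells; the shapes and mask patterns are illustrative assumptions.

    # Minimal sketch: 1 = observed / conditioned on, 0 = to be generated.
    import torch

    T, F = 24, 6                        # horizon length and number of features

    # Unconditional generation: nothing is observed.
    gen_mask = torch.zeros(T, F)

    # Imputation: scattered missing cells (here ~20% missing at random).
    imp_mask = (torch.rand(T, F) > 0.2).float()

    # Forecasting: the last 6 timesteps are unobserved for every feature.
    fc_mask = torch.ones(T, F)
    fc_mask[-6:, :] = 0.0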
Two design choices give \texttt{TimeAutoDiff} clear speed and scalability advantages.
First, the diffusion process samples a single latent trajectory for the full horizon \(1{:}T\) rather than denoising one timestep at a time; this whole-sequence sampling drastically reduces reverse-diffusion calls and yields an order-of-magnitude throughput gain.
Second, the VAE compresses along the feature axis, so very wide tables are modeled in a lower-dimensional latent space, further reducing computational load.
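To illustrate these two design choices, here is a minimal sketch, under assumed shapes and a generic DDPM-style noise schedule, of drawing one latent trajectory for the whole horizon \(1{:}T\) in a feature-compressed latent space; denoiser and decoder are hypothetical placeholders for the trained diffusion network and VAE decoder, not the paper's actual modules.

    import torch

    T, F, d_latent, n_steps = 24, 50, 8, 200       # wide table -> narrow latent
    betas = torch.linspace(1e-4, 0.02, n_steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    denoiser = lambda z, t: torch.zeros_like(z)    # placeholder noise predictor
    decoder = lambda z: torch.zeros(z.shape[0], F) # placeholder VAE decoder

    z = torch.randn(T, d_latent)                   # one latent trajectory for all of 1:T
    for t in reversed(range(n_steps)):             # loop over diffusion steps only,
        eps = denoiser(z, t)                       # never over the T timesteps
        z = (z - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)

    x = decoder(z)                                 # decode back to the F-column table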
Across six real-world datasets, \texttt{TimeAutoDiff} matches or surpasses strong baselines in synthetic sequence fidelity (discriminative, temporal-correlation, and predictive metrics) and consistently lowers MAE/MSE for imputation and forecasting tasks.
Time-varying-metadata conditioning unlocks real-world scenario exploration: by editing metadata sequences (e.g., regime labels, environmental or policy indicators), practitioners can generate coherent families of counterfactual trajectories that track intended directional changes, preserve cross-feature dependencies, and remain conditionally calibrated—making ``what-if'' analysis practical.
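A hypothetical usage sketch of metadata editing for ``what-if'' analysis is shown below; sample_conditional is a stand-in for whatever conditional sampler the model exposes, not its actual interface, and the regime labels are invented for illustration.

    import torch

    T = 24
    regime = torch.zeros(T, dtype=torch.long)      # factual: one regime throughout
    what_if = regime.clone()
    what_if[12:] = 1                               # counterfactual: switch regimes at t = 12

    def sample_conditional(metadata, n_samples=8):
        """Placeholder: would draw trajectories conditioned on the metadata sequence."""
        return torch.randn(n_samples, T, 6)

    factual_trajs = sample_conditional(regime)
    counterfactual_trajs = sample_conditional(what_if)  # compare the two families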
Ablations attribute performance gains to whole-sequence sampling, latent compression, and mask conditioning, while a distance-to-closest-record audit indicates strong generalization with limited memorization.
A code implementation of \texttt{TimeAutoDiff} is available at https://anonymous.4open.science/r/TimeAutoDiff-TMLR-7BA8/README.md.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Mingsheng_Long2
Submission Number: 5543