TimeAutoDiff: A Unified Framework for Generation, Imputation, Forecasting, and Time-Varying Metadata Conditioning of Heterogeneous Time Series Tabular Data
Abstract: We present \texttt{TimeAutoDiff}, a unified latent-diffusion framework that addresses four fundamental time-series tasks—unconditional generation, missing-data imputation, forecasting, and time-varying-metadata conditional generation—within a single model that natively handles heterogeneous features (continuous, binary, and categorical). We unify these tasks through a simple masked-modeling strategy: a binary mask specifies which time–feature cells are observed and which must be generated. To make this work on mixed data types, we pair a lightweight variational autoencoder (VAE)—which maps continuous, categorical, and binary variables into a continuous latent sequence—with a diffusion model that learns dynamics in that latent space, avoiding separate likelihoods for each data type while still capturing temporal and cross-feature structure.
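The masked-modeling unification described above can be sketched as follows. This is a minimal, hypothetical illustration (the function and mask conventions are assumptions, not the paper's actual API): a single binary mask over the (timesteps × features) grid selects which cells are observed, and the choice of mask alone distinguishes generation, imputation, and forecasting.

```python
import numpy as np

T, F = 8, 3  # sequence length, number of features (toy sizes)

def task_mask(task, t_obs=None):
    """Build an observation mask: 1 = observed cell, 0 = to be generated."""
    mask = np.ones((T, F), dtype=int)
    if task == "generation":        # nothing observed: generate everything
        mask[:] = 0
    elif task == "imputation":      # scattered cells missing at random
        rng = np.random.default_rng(0)
        mask = (rng.random((T, F)) > 0.3).astype(int)
    elif task == "forecasting":     # all features missing after time t_obs
        mask[t_obs:, :] = 0
    return mask

m = task_mask("forecasting", t_obs=5)
print(m.sum(axis=1))  # first 5 rows fully observed, last 3 fully masked
```

The same conditional model then fills in every cell where the mask is zero, which is what lets one network serve all four tasks.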
Two design choices give \texttt{TimeAutoDiff} clear speed and scalability advantages. First, the diffusion process samples a single latent trajectory for the full time horizon rather than denoising one timestep at a time; this whole-sequence sampling drastically reduces reverse-diffusion calls and yields an order-of-magnitude throughput gain. Second, the VAE compresses along the feature axis, so very wide tables are modeled in a lower-dimensional latent space, further reducing computational load.
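The whole-sequence sampling argument can be made concrete with a toy sketch (the denoiser and shapes are assumed placeholders, not the paper's architecture): because the reverse process operates on the entire latent trajectory of shape (T, d) at once, the number of denoiser calls equals the number of diffusion steps N, independent of sequence length T.

```python
import numpy as np

T, d, N = 24, 4, 10  # horizon, latent width, diffusion steps (toy sizes)
rng = np.random.default_rng(1)

def denoiser(z, step):
    """Toy stand-in for the learned denoising network."""
    return z * 0.9  # shrink toward the mean; a real model predicts the noise

calls = 0
z = rng.standard_normal((T, d))  # one latent trajectory for the full horizon
for step in reversed(range(N)):
    z = denoiser(z, step)        # denoise all T timesteps in one call
    calls += 1

print(calls)  # N calls total, versus N * T for per-timestep denoising
```

Per-timestep autoregressive denoising would instead cost on the order of N × T network calls, which is the source of the claimed throughput gain.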
Empirical evaluation demonstrates that \texttt{TimeAutoDiff} matches or surpasses strong baselines in synthetic sequence fidelity (discriminative, temporal-correlation, and predictive metrics) and consistently lowers MAE/MSE for imputation and forecasting tasks. Time-varying metadata conditioning unlocks real-world scenario exploration: by editing metadata sequences, practitioners can generate coherent families of counterfactual trajectories that track intended directional changes, preserve cross-feature dependencies, and remain conditionally calibrated—making "what-if" analysis practical.
Our ablation studies confirm that performance depends on key architectural choices, such as the VAE's continuous-feature encoding and specific components of the DDPM denoiser. Furthermore, a distance-to-closest-record (DCR) audit demonstrates that the model generalizes with limited memorization given sufficiently large datasets.
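A distance-to-closest-record audit of the kind mentioned above can be sketched as follows (the data and distance choice are illustrative assumptions, not the paper's exact protocol): for each synthetic row, compute the distance to its nearest real training row; a DCR distribution bounded away from zero suggests the generator is not simply copying training records.

```python
import numpy as np

rng = np.random.default_rng(2)
real = rng.standard_normal((100, 5))   # stand-in for training records
synth = rng.standard_normal((50, 5))   # stand-in for generated records

# Pairwise Euclidean distances via broadcasting, then min over real rows.
diff = synth[:, None, :] - real[None, :, :]     # shape (50, 100, 5)
dcr = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)  # shape (50,)

print(dcr.shape)  # one distance-to-closest-record per synthetic row
```

Near-zero DCR values would flag memorized records; comparing the synthetic DCR distribution against a real-to-real holdout baseline is the usual way to calibrate the audit.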
A code implementation of \texttt{TimeAutoDiff} is provided at https://anonymous.4open.science/r/TimeAutoDiff-TMLR-7BA8/README.md.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Mingsheng_Long2
Submission Number: 5543