Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios

TMLR Paper2106 Authors

27 Jan 2024 (modified: 31 May 2024). Rejected by TMLR.
Abstract: Missing values are common in multivariate time series data. They can harm the performance of machine learning models and introduce bias and inaccuracies into downstream analysis. These gaps arise from various sources, including sensor malfunctions, extreme events such as blackouts, and human error. Previous work has made promising strides in imputation for time series data. However, it has mostly dealt with a few selected missing patterns: missing at random, missing due to complete blackout (all features are missing for a given period of time), and forecasting. In this paper, we study a more general category of missing patterns, which we call partial blackout, wherein a subset of features is missing for one or several consecutive time steps. This describes a more natural scenario that is frequently encountered in real-world applications and covers the aforementioned patterns as special cases. We introduce a two-stage imputation process that explicitly models the feature and temporal correlations with the help of self-attention and diffusion processes. Notably, our model outperforms state-of-the-art models in general partial blackout scenarios and exhibits greater scalability, offering promise for practical data imputation needs. The code and the synthetic experiments are available at https://anonymous.4open.science/r/SADI-official-repository-3853/README.md.
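The partial blackout pattern described in the abstract can be illustrated with a small observation mask. The following is a minimal sketch, not the authors' code: the function name `partial_blackout_mask`, its parameters, and the convention that `True` means "observed" are all assumptions made for illustration. It shows how complete blackout and forecasting fall out as special cases when every feature is masked.

```python
import numpy as np

def partial_blackout_mask(n_steps, n_features, features, start, length):
    """Hypothetical helper: boolean mask of shape (n_steps, n_features).

    True marks an observed value, False a missing one (assumed convention).
    The chosen `features` are missing for `length` consecutive time steps
    beginning at index `start`.
    """
    mask = np.ones((n_steps, n_features), dtype=bool)
    stop = min(start + length, n_steps)  # clip the window to the series
    mask[start:stop, list(features)] = False
    return mask

# Partial blackout: features 1 and 3 are missing for steps 3..6.
partial = partial_blackout_mask(10, 4, features=[1, 3], start=3, length=4)

# Complete blackout: all features are missing over the same window.
complete = partial_blackout_mask(10, 4, features=range(4), start=3, length=4)

# Forecasting: all features are missing over the final steps of the series.
forecast = partial_blackout_mask(10, 4, features=range(4), start=7, length=3)
```

Under these assumptions, the only difference between the three patterns is which features are masked and where the window sits in the series.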
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We made changes in response to the reviews. We also added results in Section 5.2 comparing the performance of our model with CSDI as the number of independent and interdependent features varies. Other changes: (1) In Section 2, we added Bayesian latent models to the "Generative Models" part and added more details to "Other Methods". (2) We revised the introduction of Section 4 in response to the reviews, and made changes to Sections 4.1, 4.2, 4.3, and 4.4.
Assigned Action Editor: ~Makoto_Yamada3
Submission Number: 2106