HDDI: A Historical Data-Based Diffusion Imputation Method for High-Accuracy Recovery in Multivariate Time Series with High Missing Rate and Long-Term Gap
Multivariate time series data often contain missing values, which can degrade the performance of downstream tasks. Although some deep learning-based imputation methods perform well, they still struggle when high missing rates and long-term gaps leave too little training data. To address these challenges, we propose a Historical Data-based Multivariate Time Series Diffusion Imputation (HDDI) method. Unlike existing deep learning-based imputation methods, HDDI includes a historical data supplement module that matches and fuses historical data to supplement the training data. We further propose a diffusion imputation module that uses the supplemented training data to achieve high-accuracy imputation even in high-missing-rate and long-term missing scenarios. We conduct extensive experiments on five public multivariate time series datasets, and the results show that HDDI outperforms the baseline methods on all five. In particular, at a missing rate of 90%, HDDI improves accuracy by 25.15% over the best baseline method in the random missing scenario and by 13.64% in the long-term missing scenario. The code is available at https://github.com/liuyu3880/HDDI project.
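To make the two evaluation scenarios concrete, the following minimal sketch shows one way a random missing pattern and a long-term missing pattern could be simulated at a 90% missing rate. It is an illustration only and not the HDDI implementation: the function names, the series length of 1000 steps with 4 variables, and the choice of a single contiguous gap per variable are all assumptions, since the exact masking protocol is not specified here.

```python
import random


def random_missing_mask(n_steps, n_vars, missing_rate, rng):
    """Return mask[t][j], True = observed. Each entry is dropped independently."""
    return [[rng.random() >= missing_rate for _ in range(n_vars)]
            for _ in range(n_steps)]


def long_term_missing_mask(n_steps, n_vars, missing_rate, rng):
    """Return mask[t][j], True = observed. Each variable loses one contiguous block.

    Assumption: a "long-term gap" is modeled as a single block per variable.
    """
    gap = round(missing_rate * n_steps)
    mask = [[True] * n_vars for _ in range(n_steps)]
    for j in range(n_vars):
        start = rng.randint(0, n_steps - gap)  # randint is inclusive on both ends
        for t in range(start, start + gap):
            mask[t][j] = False
    return mask


def missing_fraction(mask):
    """Fraction of entries marked as missing (False) in the mask."""
    total = sum(len(row) for row in mask)
    missing = sum(1 for row in mask for seen in row if not seen)
    return missing / total


rng = random.Random(0)  # fixed seed so the sketch is reproducible
m_rand = random_missing_mask(1000, 4, 0.9, rng)
m_gap = long_term_missing_mask(1000, 4, 0.9, rng)
```

Both masks remove roughly the same share of values, but the random mask leaves isolated observations scattered through the series, whereas the long-term mask leaves each variable with a long unobserved stretch and no nearby context inside it.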