Abstract: Intelligent techniques, including artificial intelligence and deep learning, normally operate on complete data with no missing values. Multiple imputation is indispensable for handling missing data: it yields unbiased estimates and accounts for uncertainty, producing more valid results. Most state-of-the-art techniques focus on high missing rates (around 50%–60%) and short missing gaps, whereas imputation under extreme missing gaps and extreme missing rates remains an important challenge for multivariate time-series data generated through the Internet of Things (IoT). Hence, we propose a lightweight-window-portion-based multiple imputation (LWPMI) technique built on multivariate variables, correlation, data fusion, regression, and multiple imputation. We conduct extensive experiments on sensor-generated data, introducing extreme missing gaps and high missing rates ranging from 10% to 90%. We also investigate different sets of features to examine how LWPMI behaves when features have high, weak, or a mixture of high and weak correlation. The results show that LWPMI outperforms baseline techniques in preserving pattern, structure, and trend for both extreme missing gaps and missing rates as high as 90%.