Online Mixed Missing Value Imputation Using Gaussian Copula

Published: 06 Jul 2020, Last Modified: 05 May 2023 | ICML Artemiss 2020 | Readers: Everyone
TL;DR: Performant online and minibatch algorithms for missing value imputation on mixed data using a Gaussian copula
Keywords: mixed data, ordinal data, Gaussian copula, missing values, imputation, online
Abstract: Many data science algorithms require complete observations, making missing value imputation an important step in many data processing pipelines. Imputation is also of independent interest for applications such as recommender systems. To address real-world big data problems, imputation algorithms must handle mixed data, containing ordinal, boolean, and continuous variables, and such algorithms must be highly scalable. In this work we develop a semi-parametric online algorithm for mixed missing value imputation using a Gaussian Copula. This online algorithm improves on the speed of its offline counterpart by an order of magnitude, with similar accuracy. The online method can also improve on the offline method by adapting to a changing data distribution.
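For intuition only, the general idea of Gaussian copula imputation can be sketched in a few lines of Python. This is not the algorithm from the paper: it is offline rather than online, it handles only two continuous columns (no ordinal or boolean variables), and it assumes the first column is fully observed. The function names `to_latent`, `from_latent`, and `impute_pair` are hypothetical. Each column is mapped to latent normal scores through its empirical CDF, the latent correlation is estimated from complete rows, and each missing entry is filled with the conditional mean of the latent variable mapped back through an empirical quantile.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the latent scale


def to_latent(value, observed_sorted):
    """Map a value to a latent normal score via a scaled empirical CDF."""
    n = len(observed_sorted)
    rank = max(1, sum(1 for v in observed_sorted if v <= value))
    # Dividing by n + 1 keeps the CDF value strictly inside (0, 1).
    return _STD_NORMAL.inv_cdf(rank / (n + 1))


def from_latent(z, observed_sorted):
    """Map a latent normal score back to the data scale (empirical quantile)."""
    n = len(observed_sorted)
    u = _STD_NORMAL.cdf(z)
    index = min(n - 1, max(0, round(u * (n + 1)) - 1))
    return observed_sorted[index]


def impute_pair(x, y):
    """Fill None entries of y given a fully observed x (illustrative only)."""
    x_sorted = sorted(x)
    y_sorted = sorted(v for v in y if v is not None)
    zx = [to_latent(v, x_sorted) for v in x]

    # Latent correlation from complete rows: Pearson correlation of normal scores.
    pairs = [(zx[i], to_latent(v, y_sorted)) for i, v in enumerate(y) if v is not None]
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    rho = cov / sqrt(var_a * var_b)

    # For standard normal margins, E[z_y | z_x] = rho * z_x.
    return [
        from_latent(rho * zx[i], y_sorted) if v is None else v
        for i, v in enumerate(y)
    ]


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [10, 20, 30, None, 50, 60, 70, 80]
filled = impute_pair(x, y)
```

Because the back-transform uses empirical quantiles, every imputed value in this sketch is one of the observed values of that column. An online variant would replace the one-shot correlation estimate with incremental updates as new rows arrive, which is the setting the abstract addresses.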
