Keywords: Imputation, Generative Models
Abstract: Missing data frequently arises across diverse domains, including time-series and image domains. In the real world, missing occurrences often depend on the unobservable values themselves, which are referred to as Missing Not at Random (MNAR). To address this, numerous generative models have been proposed, with diffusion models in particular demonstrating strong capabilities in out-of-sample imputation. However, most existing diffusion-based imputation approaches overlook the MNAR setting and instead rely on restrictive assumptions about the missing process, thereby limiting their applicability to practical scenarios. In this work, we introduce the Missing Pattern Recognized Diffusion Imputation Model (PRDIM), a novel framework that explicitly captures the missing pattern and precisely imputes unobserved values. PRDIM iteratively maximizes the likelihood of the joint distribution for observed values and missing mask under an Expectation-Maximization (EM) algorithm. In this sense, we first employ a pattern recognizer, which approximates the underlying missing pattern and provides guidance during every inference toward more plausible imputations with respect to the missing information. In various experimental settings, we demonstrate that PRDIM achieves the state-of-the-art performance compared to previous diffusion imputation approaches under MNAR setting.
Supplementary Material: zip
Primary Area: generative models
Submission Number: 24800
Loading