Abstract: Deep learning models benefit from rich (e.g., multimodal) input features. However, multimodal models can be challenging to deploy because some inputs may be missing at inference time. Current popular solutions include marginalization, imputation, and training multiple models. Marginalization achieves calibrated predictions, but it is computationally expensive and only feasible for low-dimensional inputs. Imputation may result in inaccurate predictions, particularly when high-dimensional data, such as images, are missing. Training multiple models, where each model is designed to handle a different subset of inputs, can work well but requires prior knowledge of the missing input patterns. Furthermore, training and retaining multiple models can be costly. We propose an efficient method to learn both the conditional distribution given full inputs and the marginal distributions. Our method, Knockout, randomly replaces input features with appropriate placeholder values during training. We provide a theoretical justification for Knockout and show that it can be interpreted as an implicit marginalization strategy. We evaluate Knockout across a wide range of simulations and real-world datasets and show that it offers strong empirical performance.
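To make the training strategy described in the abstract concrete, below is a minimal sketch assuming a PyTorch setting. The knockout probability `p`, the choice of placeholder values, and all function and argument names are illustrative assumptions, not the paper's actual implementation.

```python
import torch

def knockout(x: torch.Tensor, placeholder: torch.Tensor, p: float = 0.3,
             training: bool = True) -> torch.Tensor:
    """Randomly replace input features with placeholder values during training.

    x:           batch of inputs, shape (batch, num_features)
    placeholder: placeholder value per feature, shape (num_features,)
    p:           assumed probability of knocking out each feature
    """
    if not training:
        # At inference, genuinely missing features would be set to the same
        # placeholder values upstream of the model.
        return x
    # Sample an independent knockout mask per example and per feature.
    mask = torch.rand(x.shape, device=x.device) < p
    return torch.where(mask, placeholder.expand_as(x), x)

# Usage sketch: during training the model sees a mix of full and partially
# knocked-out inputs, which is how it can learn both the full-input conditional
# and the marginal predictors.
x = torch.randn(8, 4)            # toy batch with 4 input features
placeholder = torch.zeros(4)     # e.g., zeros as the assumed placeholder values
x_knocked = knockout(x, placeholder, p=0.3)
```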
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~David_Rügamer1
Submission Number: 4468