JUMP: JOINTLY UTILIZING MISSINGNESS FOR PREDICTION ON INCOMPLETE TABULAR DATA

17 Sept 2025 (modified: 13 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: tabular learning; tabular embedding; masked autoencoder
Abstract: Impute-then-predict is the default approach for tabular data with missing values, yet optimizing an imputer for reconstruction accuracy rarely guarantees downstream gains and induces distribution shift when train and test missingness differ. We present JUMP, an end-to-end missingness-aware framework that jointly optimizes imputation and prediction. JUMP re-masks a subset of observed features as reconstruction targets, shares a single encoder between the reconstruction and prediction heads, and explicitly injects missingness indicators to fuse pattern cues with raw features. This design transforms imputation from a standalone preprocessing step into a training signal that directly serves the predictive objective, acting as a lightweight regularizer that stabilizes representations under missingness. Extensive experiments on eight benchmarks show that JUMP achieves state-of-the-art performance, consistently outperforming twelve impute-then-predict pipelines, strong tree-based models, and advanced neural architectures across diverse missingness mechanisms and challenging out-of-distribution settings.
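The three ingredients named in the abstract (re-masking observed features, a shared encoder with two heads, and injected missingness indicators) can be illustrated with a minimal NumPy forward pass. This is a sketch under stated assumptions, not the authors' implementation. The function names, the re-masking rate `p`, the loss weight `lam`, the single-layer ReLU encoder, the zero-fill of hidden entries, and the binary-classification loss are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def remask(X, p, rng):
    """Pick a random subset of observed entries as reconstruction targets."""
    observed = ~np.isnan(X)                          # originally observed entries
    targets = observed & (rng.random(X.shape) < p)   # re-masked (hidden) entries
    visible = observed & ~targets                    # what the encoder may see
    return observed, targets, visible

def build_input(X, visible):
    """Concatenate zero-filled values with the visibility indicator."""
    values = np.where(visible, X, 0.0)
    return np.concatenate([values, visible.astype(float)], axis=1)

def init_params(d, hidden, rng):
    # Assumed toy architecture: one shared linear+ReLU encoder, two linear heads.
    return {
        "W_enc": rng.normal(0, 0.1, (2 * d, hidden)), "b_enc": np.zeros(hidden),
        "W_rec": rng.normal(0, 0.1, (hidden, d)), "b_rec": np.zeros(d),
        "w_pred": rng.normal(0, 0.1, hidden), "b_pred": 0.0,
    }

def joint_loss(params, X, y, targets, visible, lam):
    """Prediction loss plus lam times reconstruction loss on re-masked entries."""
    h = np.maximum(0.0, build_input(X, visible) @ params["W_enc"] + params["b_enc"])
    x_hat = h @ params["W_rec"] + params["b_rec"]      # reconstruction head
    logits = h @ params["w_pred"] + params["b_pred"]   # prediction head
    # Numerically stable binary cross-entropy on logits.
    bce = np.mean(np.logaddexp(0.0, logits) - y * logits)
    # Squared error only where a true value exists and was hidden.
    sq_err = np.where(targets, (x_hat - X) ** 2, 0.0)
    rec = sq_err.sum() / max(targets.sum(), 1)
    return bce + lam * rec, bce, rec

# Toy data: 6 rows, 4 features, roughly 30% of entries missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X[rng.random(X.shape) < 0.3] = np.nan
y = rng.integers(0, 2, size=6).astype(float)

observed, targets, visible = remask(X, p=0.25, rng=rng)
params = init_params(d=4, hidden=8, rng=rng)
total, bce, rec = joint_loss(params, X, y, targets, visible, lam=0.5)
```

Because both heads read the same hidden representation, gradients from the reconstruction term would shape the features used for prediction, which is the sense in which imputation acts as a regularizer rather than a separate preprocessing step.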
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 8293