TL;DR: We propose missingness-avoiding (MA) learning, a framework that adds classifier-specific regularization to decision trees, tree ensembles, and sparse linear models, reducing their reliance on missing values at test time while preserving accuracy and interpretability.
Abstract: Handling missing values at test time is challenging for machine learning models, especially when aiming for both high accuracy and interpretability. Established approaches often add bias through imputation or excessive model complexity via missingness indicators. Moreover, either method can obscure interpretability, making it harder to understand how the model uses the observed variables in its predictions. We propose *missingness-avoiding* (MA) machine learning, a general framework for training models to rarely require the values of missing (or imputed) features at test time. We create tailored MA learning algorithms for decision trees, tree ensembles, and sparse linear models by incorporating classifier-specific regularization terms into their learning objectives. The tree-based models exploit contextual missingness, reducing reliance on missing values based on the observed context. Experiments on real-world datasets demonstrate that **MA-DT, MA-LASSO, MA-RF**, and **MA-GBT** effectively reduce reliance on features with missing values while maintaining predictive performance competitive with their unregularized counterparts. This shows that our framework gives practitioners a powerful tool to maintain interpretability in predictions with test-time missing values.
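To make the tree criterion concrete, here is a minimal sketch of what an MA-regularized split score could look like. This is an illustrative reading of the abstract, not the authors' implementation (see the linked repository for that): the usual impurity reduction of a candidate split is discounted in proportion to how often the split feature is missing among the samples reaching the node, which is what ties the penalty to the observed context. The function names and the trade-off weight `alpha` are hypothetical.

```python
# Illustrative sketch only, NOT the authors' implementation: an MA-style
# split criterion that discounts a feature's impurity reduction by its
# contextual missingness rate at the current node.
import numpy as np

def gini(y):
    """Gini impurity of an integer label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)

def ma_split_score(X, y, j, threshold, alpha=1.0):
    """Impurity reduction of splitting feature j at `threshold`, minus an MA
    penalty proportional to the fraction of samples at this node that are
    missing feature j (the contextual missingness rate)."""
    missing = np.isnan(X[:, j])
    rho = missing.mean()                      # contextual missingness rate
    obs = ~missing
    n = obs.sum()
    if n == 0:
        return -np.inf
    left = np.zeros_like(obs)
    left[obs] = X[obs, j] <= threshold        # compare observed values only
    right = obs & ~left
    if left.sum() == 0 or right.sum() == 0:
        return -np.inf
    children = (left.sum() / n) * gini(y[left]) + (right.sum() / n) * gini(y[right])
    return (gini(y[obs]) - children) - alpha * rho

# Toy usage: feature 0 is informative but missing 60% of the time; feature 1
# is a noisier, always-observed proxy. With alpha = 1.0 the MA score flips to
# prefer the observed proxy.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
X = np.column_stack([x0, x0 + rng.normal(scale=1.5, size=500)])
y = (x0 > 0).astype(int)
X[rng.random(500) < 0.6, 0] = np.nan
print(ma_split_score(X, y, j=0, threshold=0.0, alpha=1.0))  # penalized
print(ma_split_score(X, y, j=1, threshold=0.0, alpha=1.0))  # preferred
```

With a large enough `alpha`, the score prefers a weaker but reliably observed feature over a stronger one that is often missing, which is the trade-off the MA framework controls.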
Lay Summary: Machine learning models often struggle when values are missing at test time, a common issue in real-world applications like healthcare or finance. Existing solutions either fill in missing values (which can introduce bias) or make models more complex (which can reduce interpretability). This makes it harder to understand and trust model decisions.
We tackle this problem by developing missingness-avoiding (MA) machine learning, a new framework that encourages models to minimize reliance on missing features during prediction. We design custom versions of this method for several model classes, including decision trees, random forests, gradient boosting, and sparse linear models, each with tailored regularization that steers decisions toward observed inputs rather than imputed or missing data.
In experiments across several real-world datasets, our models remained accurate while becoming significantly less dependent on missing features. This work matters because it allows practitioners to make accurate, interpretable predictions, even when some values are missing, offering safer, more trustworthy ML tools for critical domains.
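For the sparse linear models, one plausible instantiation of the MA idea, sketched below under our own assumptions rather than taken from the paper, is to add a missingness-weighted L1 term so that coefficients on frequently missing features are shrunk away first. The parameter names `lam` and `gamma` are hypothetical.

```python
# Illustrative sketch only: an MA-regularized sparse linear objective under
# the assumption that the MA penalty is a missingness-weighted L1 term. The
# paper's exact objective may differ (see the linked repository).
import numpy as np

def ma_lasso_objective(beta, X_imp, y, miss_mask, lam=0.01, gamma=0.1):
    """Squared error on (singly) imputed data, a standard L1 sparsity term,
    and an MA term that charges extra for coefficients on features that are
    frequently missing.

    X_imp:     (n, d) design matrix after imputation
    miss_mask: (n, d) boolean, True where the original entry was missing
    """
    rho = miss_mask.mean(axis=0)                  # per-feature missingness rate
    mse = np.mean((y - X_imp @ beta) ** 2)
    sparsity = lam * np.sum(np.abs(beta))
    ma_term = gamma * np.sum(rho * np.abs(beta))  # avoid often-missing features
    return mse + sparsity + ma_term
```

Since the combined penalty is just an L1 norm with per-feature weight `lam + gamma * rho[j]`, any weighted-LASSO solver (e.g., coordinate descent) can minimize this objective.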
Link To Code: https://github.com/Healthy-AI/malearn
Primary Area: General Machine Learning->Supervised Learning
Keywords: Missing values, interpretability, boosting, decision trees, rule-based methods
Submission Number: 6874