Leveraging Predictive Equivalence in Decision Trees

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: Decision trees’ representations are not unique; we use this to improve missingness handling, variable importance, and cost optimization.
Abstract: Decision trees are widely used for interpretable machine learning due to their clearly structured reasoning process. However, this structure belies a challenge we refer to as predictive equivalence: a given tree's decision boundary can be represented by many different decision trees. The presence of models with identical decision boundaries but different evaluation processes makes model selection challenging. The models will have different variable importance and behave differently in the presence of missing values, but most optimization procedures will arbitrarily choose one such model to return. We present a Boolean logical representation of decision trees that does not exhibit predictive equivalence and is faithful to the underlying decision boundary. We apply our representation to several downstream machine learning tasks. Using our representation, we show that decision trees are surprisingly robust to test-time missingness of feature values; we address predictive equivalence's impact on quantifying variable importance; and we present an algorithm to optimize the cost of reaching predictions.
Lay Summary: Decision trees are a popular machine learning tool because they explain decisions with a simple flowchart structure. To obtain predictions from these trees, common practice is to follow the flowchart from the top down. However, in many cases, several different flowcharts make equivalent predictions (hence the name, "predictive equivalence"), even though they present different top-down decision paths. To address this, we introduce a way to represent decision trees using Boolean logic. This representation keeps the prediction behavior the same but removes the confusion caused by having many versions of the same model. It also has some surprising benefits: it helps decision trees handle missing data better than expected; it gives a more accurate view of which features are important; and it provides a way to reduce the cost of making predictions with a given tree.
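Illustration (not from the paper): a minimal sketch, assuming two hypothetical binary features x1 and x2 and using sympy, of the predictive-equivalence phenomenon the abstract describes. Two structurally different trees with the same decision boundary collapse to one Boolean formula once each class-1 leaf path is written as a conjunction of its split conditions and the paths are OR-ed together; the authors' actual representation and algorithms are in the linked repository.

```python
# Minimal sketch (not the authors' implementation) of predictive equivalence:
# two different tree structures, one underlying Boolean decision boundary.
from sympy import symbols
from sympy.logic.boolalg import Or, And, Not, simplify_logic

x1, x2 = symbols("x1 x2")  # hypothetical binary features

# Tree A: split on x1 first, then x2.
#   x1=1 -> predict 1;  x1=0, x2=1 -> predict 1;  x1=0, x2=0 -> predict 0
tree_a_class1 = Or(x1, And(Not(x1), x2))

# Tree B: split on x2 first, then x1 (different evaluation order, same boundary).
tree_b_class1 = Or(x2, And(Not(x2), x1))

# Reducing each class-1 region to a canonical DNF removes the spurious
# structural differences between the two trees.
canon_a = simplify_logic(tree_a_class1, form="dnf")
canon_b = simplify_logic(tree_b_class1, form="dnf")
print(canon_a)            # x1 | x2
print(canon_b)            # x1 | x2
print(canon_a == canon_b) # True: the trees are predictively equivalent
```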
Link To Code: https://github.com/HaydenMcT/predictive-equivalence
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: Decision Tree, Interpretability, Machine Learning, Missing Data, Variable Importance, Cost Optimization
Submission Number: 13085