Beyond Greedy: Towards Optimal Deep Classification Trees

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: decision tree, CART, optimal classification trees
Abstract: Decision trees are central to interpretable machine learning but face severe scalability challenges. Existing globally optimal methods are limited to binary feature selection and shallow tree depths, while traditional heuristic approaches often sacrifice accuracy. To overcome these limitations, this paper introduces a moving-horizon approximate branch-and-reduce method for constructing near-optimal deep classification trees on large-scale datasets with continuous features. The method is based on a bilevel optimization framework, in which the upper-level problem is addressed with a branch-and-reduce procedure and the lower-level problem is solved recursively. Although the underlying framework guarantees global optimality, we improve its efficiency for deeper trees by introducing an approximate solution to the lower-level problem, which can be viewed as a lookahead rollout in reinforcement learning. Accuracy is further refined by a low-cost moving-horizon strategy. Extensive experiments demonstrate that the proposed method consistently outperforms existing heuristic baselines in test accuracy, while remaining scalable on large datasets where globally optimal methods are not.
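To illustrate the lookahead-rollout idea the abstract alludes to, here is a minimal, hypothetical sketch (not the authors' implementation): instead of scoring each candidate root split by its one-step impurity reduction as CART does, each candidate is scored by the cost of a cheap greedy subtree grown beneath it for a few levels. All function names and the depth parameter `horizon` are illustrative assumptions.

```python
# Hypothetical sketch of lookahead split selection; not the paper's actual code.
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - np.sum(p ** 2)

def best_greedy_split(X, y):
    """One-step greedy (CART-style) split minimizing weighted child impurity."""
    best = (None, None, gini(y) * len(y))  # (feature, threshold, cost)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            cost = gini(y[left]) * left.sum() + gini(y[~left]) * (~left).sum()
            if cost < best[2]:
                best = (j, t, cost)
    return best

def rollout_cost(X, y, depth):
    """Approximate lower-level value: impurity cost of a greedy subtree."""
    if depth == 0 or len(np.unique(y)) <= 1:
        return gini(y) * len(y)
    j, t, _ = best_greedy_split(X, y)
    if j is None:  # no improving split found
        return gini(y) * len(y)
    left = X[:, j] <= t
    return (rollout_cost(X[left], y[left], depth - 1)
            + rollout_cost(X[~left], y[~left], depth - 1))

def lookahead_split(X, y, horizon=2):
    """Pick the split whose greedy rollout over `horizon` levels is cheapest."""
    best_j, best_t, best_cost = None, None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            cost = (rollout_cost(X[left], y[left], horizon - 1)
                    + rollout_cost(X[~left], y[~left], horizon - 1))
            if cost < best_cost:
                best_j, best_t, best_cost = j, t, cost
    return best_j, best_t
```

On XOR-like data, a one-step greedy criterion finds no improving root split, whereas the two-level rollout does, which is the kind of myopia the lookahead is meant to fix; a moving-horizon scheme would then re-run such a lookahead at each node as the tree is grown.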
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 8113