Differentially Private XGBoost Revisited: Is Random Decision Trees Really Better than Greedy Ones?

TMLR Paper6073 Authors

03 Oct 2025 (modified: 29 Nov 2025) | Under review for TMLR | CC BY 4.0
Abstract: Boosted decision trees (e.g., XGBoost) are among the strongest and most widely used machine learning models. Motivated by applications in sensitive domains, various boosted decision tree learners with provable differential privacy (DP) guarantees have been designed. In contrast to the non-private setting, a recent study shows that privately boosting random decision trees outperforms a more faithful privatization of XGBoost that uses greedy decision trees. In this paper, we challenge this conclusion with an improved DP-XGBoost algorithm and a thorough empirical study. Our results indicate that, although random selection remains strong on most datasets, our improved DP analysis narrows the performance gap between random and greedy selection. At the same time, we identify regimes in which greedy selection clearly outperforms random selection: when the number of trees is restricted to be small (e.g., for interpretability) or when interaction terms play a key role in prediction, random selection can suffer degraded accuracy, whereas our greedy DP-XGB strategy remains robust and achieves better performance.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Zhanyu_Wang1
Submission Number: 6073