Explainable models via compression of tree ensembles

Siwen Yan, Sriraam Natarajan, Saket Joshi, Roni Khardon, Prasad Tadepalli

Published: 2024, Last Modified: 12 May 2025. Mach. Learn. 2024. License: CC BY-SA 4.0

Abstract: Ensemble models (bagging and gradient boosting) of relational decision trees have proved to be among the most effective learning methods in the area of probabilistic logic models (PLMs). While effective, they lose one of the most important benefits of PLMs: interpretability. In this paper we consider the problem of compressing a large set of learned trees into a single explainable model. To this end, we propose CoTE (Compression of Tree Ensembles), which produces a single small decision list as a compressed representation. CoTE first converts the trees to decision lists and then combines and compresses them with the aid of the original training set. An experimental evaluation demonstrates the effectiveness of CoTE on several benchmark relational data sets.
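
To make the first step concrete, the following is a minimal sketch of how a single decision tree can be flattened into an ordered decision list. It is not the CoTE implementation: it assumes a simplified propositional setting with boolean features rather than relational trees, and the names `Node`, `Leaf`, `tree_to_decision_list`, `predict`, and the example features are hypothetical. Because the rules are checked in order, each rule only needs the tests that were satisfied on its path; the failed tests are implied by the earlier rules not firing. The combination and compression steps, which rely on the training set, are not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    value: float  # prediction stored at the leaf (e.g. a probability)


@dataclass
class Node:
    test: str                   # name of a boolean feature (hypothetical)
    yes: Union["Node", "Leaf"]  # subtree followed when the test holds
    no: Union["Node", "Leaf"]   # subtree followed when the test fails


Rule = Tuple[Tuple[str, ...], float]  # (features that must hold, value)


def tree_to_decision_list(tree: Union[Node, Leaf],
                          path: Tuple[str, ...] = ()) -> List[Rule]:
    """Flatten a tree into an ordered rule list, 'yes' branches first.

    Only satisfied tests are recorded on the path; a 'no' branch adds
    nothing, because its rules are reached only after every rule from
    the sibling 'yes' branch has failed.
    """
    if isinstance(tree, Leaf):
        return [(path, tree.value)]
    return (tree_to_decision_list(tree.yes, path + (tree.test,))
            + tree_to_decision_list(tree.no, path))


def predict(rules: List[Rule], example: Dict[str, bool]) -> float:
    """Return the value of the first rule whose conditions all hold."""
    for conditions, value in rules:
        if all(example.get(feature, False) for feature in conditions):
            return value
    raise ValueError("no rule fired")  # unreachable for lists built above


# Toy tree: the feature names are illustrative only.
tree = Node("smokes",
            Node("friend_smokes", Leaf(0.9), Leaf(0.6)),
            Leaf(0.1))
rules = tree_to_decision_list(tree)
```

In this sketch the last rule always has an empty condition set, so it acts as the default case of the list, and `predict(rules, example)` returns the same value as walking the original tree would.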