Abstract: In this work, we introduce a new representation framework for decision tree ensembles: instead of using a graph model, we transform each tree into a well-known polynomial form. We apply the new representation to three tasks: theoretical analysis, model reduction, and interpretation. The polynomial form of a tree ensemble allows a straightforward interpretation of the original model; in our experiments, it shows results comparable with state-of-the-art interpretation techniques. Another application of the framework is ensemble-wise pruning: we can drop monomials from the polynomial based on training data statistics. This way, we reduce the model size by up to 3 times without loss of quality. It is also possible to show the equivalence of tree shape classes that share the same polynomial. This fact lets us train a model in one tree shape and deploy it in another that is easier to compute or interpret. We formulate the problem of optimal tree ensemble translation from one form to another and build a greedy solution to it.
CMT Num: 7685
Code Link: Our framework was implemented as part of a Java-based library for ML experiments (https://github.com/spbsu-ml-community/jmll), and the experiments were run with its CLI version. The same framework was also implemented as part of the GPU version of CatBoost, and several experiments were run with that code. Those sources are not publicly available yet, but we plan to release them by the time of the conference in the CatBoost GitHub repository: https://github.com/catboost/catboost
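Illustrative sketch: the abstract describes rewriting each tree as a polynomial and dropping monomials using training data statistics. The Python below is a minimal sketch of one way to do this, not the implementation from jmll or CatBoost. It assumes a tree is either a leaf value or a tuple (feature, threshold, left, right), that each split is the binary indicator [x[feature] > threshold], and that a monomial is a set of such indicators. The pruning score (absolute coefficient times the fraction of training points on which the monomial fires) and the names to_polynomial, eval_polynomial, eval_tree, and prune are hypothetical choices for illustration only; the abstract does not specify the actual criterion.

```python
from collections import defaultdict


def to_polynomial(node):
    """Expand a tree into {monomial: coefficient}.

    A monomial is a frozenset of (feature, threshold) indicators.
    For a split with indicator b: f = left + b * (right - left).
    Because b * b == b, multiplying a monomial by b is a set union.
    """
    if not isinstance(node, tuple):
        return {frozenset(): float(node)}  # leaf: constant term
    feature, threshold, left, right = node
    b = frozenset([(feature, threshold)])
    poly = defaultdict(float)
    for mono, coef in to_polynomial(left).items():
        poly[mono] += coef
        poly[mono | b] -= coef
    for mono, coef in to_polynomial(right).items():
        poly[mono | b] += coef
    return {m: c for m, c in poly.items() if c != 0.0}


def fires(mono, x):
    """True if every indicator in the monomial equals 1 at point x."""
    return all(x[f] > t for f, t in mono)


def eval_polynomial(poly, x):
    return sum(c for mono, c in poly.items() if fires(mono, x))


def eval_tree(node, x):
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = right if x[feature] > threshold else left
    return float(node)


def prune(poly, train_points, min_impact):
    """Hypothetical criterion: keep monomials with |coef| * support >= min_impact."""
    n = len(train_points)
    kept = {}
    for mono, coef in poly.items():
        support = sum(fires(mono, x) for x in train_points) / n
        if abs(coef) * support >= min_impact:
            kept[mono] = coef
    return kept


# Small example tree: split on feature 0, then on feature 1.
tree = (0, 0.5, 1.0, (1, 2.0, 3.0, 5.0))
poly = to_polynomial(tree)
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
pruned = prune(poly, points, min_impact=0.7)
```

An ensemble polynomial would be the coefficient-wise sum of the per-tree polynomials, which is why pruning in this form acts on the whole ensemble rather than on individual trees.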