Abstract: A number of recent works have employed decision trees to construct explainable partitions that aim to minimize the $k$-means cost function. These works, however, largely ignore metrics related to the depths of the leaves in the resulting tree, which is surprising given that the explainability of a decision tree depends on these depths. To fill this gap in the literature, we propose an efficient algorithm whose loss function includes a penalty term favoring the construction of shallow decision trees, i.e., trees whose leaves are not very deep. Such trees yield clusters that are defined by a small number of attributes and are therefore easier to explain. In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms recently presented in the literature, typically achieving lower or equivalent costs with considerably shallower trees.
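The abstract leaves the exact form of the penalty unspecified; as a purely illustrative sketch, one natural shape for such a depth-penalized objective, for a threshold tree $T$ inducing clusters $C_1, \dots, C_k$ with centers $\mu_1, \dots, \mu_k$, would be

$$\min_{T} \;\; \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2 \;+\; \lambda \sum_{j=1}^{k} \operatorname{depth}(C_j),$$

where $\operatorname{depth}(C_j)$ denotes the depth of the leaf defining cluster $C_j$ and $\lambda \ge 0$ trades off clustering cost against tree shallowness. The penalty actually used in the paper may differ from this hypothetical form.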