Tree comprehensive diversity-based pruned deep forest

Published: 2025 · Last Modified: 23 Jan 2026 · Int. J. Mach. Learn. Cybern. 2025 · License: CC BY-SA 4.0
Abstract: Deep Forest (DF) is a deep learning model built from non-differentiable modules; it extends the deep learning paradigm to non-neural-network structures and has demonstrated remarkable performance in several fields. DF uses a cascade structure in which each layer contains one or more sets of random forests. However, some of these forests may include decision trees that are highly redundant or perform poorly. Additionally, the feature representations in DF are composed of predicted class-probability vectors, and these prediction-based representations require retaining a large amount of forest-model information for subsequent testing. These two characteristics make DF prone to high time, memory, and storage costs, and may even degrade performance. To address these challenges, we design a novel pruning strategy for optimizing the random forests in the cascade layers and propose a new deep forest model called Tree Comprehensive Diversity-based Pruned Deep Forest (TCDPDF). Unlike most methods that measure diversity from model predictions, we explore a diversity measure based on the morphological structure of decision trees. Furthermore, we propose a tree comprehensive diversity measure for the pruning strategy that balances the structural diversity and prediction accuracy of decision trees. We conduct experiments on 12 public datasets, and the results demonstrate that TCDPDF achieves competitive performance, significantly reduces the model's high computational complexity, and further enhances its generalization ability.
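The abstract's core idea, pruning a forest by a score that balances structural diversity against per-tree accuracy, can be sketched as follows. Note that the paper's actual tree comprehensive diversity measure is not reproduced here: the Jaccard distance over each tree's set of split features is a hypothetical stand-in for structural diversity, and the weight `alpha`, the function `prune_forest`, and the validation-set scoring are illustrative assumptions, not the authors' method.

```python
# Hedged sketch of diversity-plus-accuracy forest pruning.
# ASSUMPTIONS: Jaccard distance over split-feature sets stands in for the
# paper's structural diversity; `alpha` is an illustrative trade-off weight.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def split_feature_set(tree):
    # Features used at internal split nodes (leaves are marked with -2).
    f = tree.tree_.feature
    return set(f[f >= 0].tolist())

def prune_forest(forest, X_val, y_val, keep, alpha=0.5):
    trees = forest.estimators_
    feats = [split_feature_set(t) for t in trees]
    # Per-tree prediction accuracy on a held-out validation set.
    acc = np.array([t.score(X_val, y_val) for t in trees])
    # Structural diversity of tree i: mean Jaccard distance to the others.
    div = np.array([
        np.mean([1 - len(feats[i] & feats[j]) / max(len(feats[i] | feats[j]), 1)
                 for j in range(len(trees)) if j != i])
        for i in range(len(trees))
    ])
    # Comprehensive score balancing diversity and accuracy; keep the top trees.
    score = alpha * div + (1 - alpha) * acc
    order = np.argsort(score)[::-1][:keep]
    return [trees[i] for i in order]

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
pruned = prune_forest(rf, X, y, keep=10)
```

In a cascade deep forest, a step like this would run per layer, so only the retained trees' models need to be stored for producing the class-probability representations passed to the next layer, which is the source of the time, memory, and storage savings the abstract claims.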