Constant Time Decision Trees and Random Forest

Maddimsetti Srinivas, Debdoot Sheet

Published: 01 Jan 2024, Last Modified: 26 Dec 2024, ICPR (5) 2024, CC BY-SA 4.0

Abstract: The inference time of a classically implemented binary decision tree (BDT) is stochastic and bounded by the tree's depth. The lower and upper bounds are set by the depths of the shallowest and deepest leaf nodes, respectively. This variability makes BDTs and random forests (RF) hard to use on streaming data arriving at a fixed rate, where constant-time algorithms are preferred. The proposed method represents a BDT as a function of Boolean variables in order to achieve constant time complexity at inference. Each decision node in the BDT is represented as a Boolean variable, and the resulting structure is referred to as a Boolean decision structure (BDS). It has a time complexity of O(N), where N is the number of Boolean variables. Grouping decision nodes with approximately similar decision boundaries further reduces the number of Boolean variables, yielding an optimized BDS (OBDS). Experimentally, BDS and OBDS deliver performance statistically equivalent to that of the BDT and the BDT-based RF while featuring a reduced and constant time complexity at inference, which does not vary with the depth of the BDT or the number of BDTs in an RF.
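
The idea of treating each decision node as a Boolean variable can be illustrated with a small sketch. This is not the authors' implementation: the three-node tree, its thresholds, and the names `FEATURE`, `THRESHOLD`, `POS`, `NEG`, `LEAF_CLASS`, `predict_traversal`, and `predict_boolean` are all assumptions made up for illustration. The sketch evaluates every node comparison for every input, then selects the single leaf whose path conditions all hold, so the amount of work is the same regardless of which leaf is reached.

```python
import numpy as np

# Hypothetical tree over two features (illustrative values only):
#   node 0: x[0] <= 0.5 ? go to node 1 : go to node 2
#   node 1: x[1] <= 0.3 ? leaf A (class 0) : leaf B (class 1)
#   node 2: x[1] <= 0.7 ? leaf C (class 1) : leaf D (class 0)
FEATURE = np.array([0, 1, 1])          # feature tested at each node
THRESHOLD = np.array([0.5, 0.3, 0.7])  # threshold at each node

# One row per leaf, one column per node.
# POS marks nodes that must be True on the path, NEG nodes that must be False.
POS = np.array([[1, 1, 0],   # leaf A: b0 and b1
                [1, 0, 0],   # leaf B: b0 and not b1
                [0, 0, 1],   # leaf C: not b0 and b2
                [0, 0, 0]],  # leaf D: not b0 and not b2
               dtype=bool)
NEG = np.array([[0, 0, 0],
                [0, 1, 0],
                [1, 0, 0],
                [1, 0, 1]], dtype=bool)
LEAF_CLASS = np.array([0, 1, 1, 0])


def predict_traversal(x):
    """Classic root-to-leaf traversal; the path taken depends on the input."""
    if x[0] <= 0.5:
        return 0 if x[1] <= 0.3 else 1
    return 1 if x[1] <= 0.7 else 0


def predict_boolean(x):
    """Evaluate all N node comparisons, then pick the one satisfied leaf."""
    x = np.asarray(x, dtype=float)
    b = x[FEATURE] <= THRESHOLD              # the N Boolean variables
    violated = (POS & ~b) | (NEG & b)        # per-leaf, per-node violations
    active = ~violated.any(axis=1)           # exactly one leaf is active
    return int(active @ LEAF_CLASS)


if __name__ == "__main__":
    for point in ([0.2, 0.1], [0.2, 0.9], [0.8, 0.6], [0.8, 0.9]):
        print(point, predict_traversal(point), predict_boolean(point))
```

Both functions should agree on every input; the difference is that `predict_boolean` performs a fixed number of comparisons and Boolean operations, which grows with the number of nodes N rather than with the path taken. Grouping nodes with similar decision boundaries, as in the OBDS described above, would shrink N and is not shown here.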