Keywords: Neural Tangent Kernel, Tree Ensemble, Soft Tree
TL;DR: We apply an axis-aligned constraint to differentiable soft trees and introduce their Neural Tangent Kernel (NTK), which allows us to describe the training behavior analytically.
Abstract: *Axis-aligned rules* are known to induce an important inductive bias in machine learning models such as typical hard decision tree ensembles. However, the theoretical understanding of their learning behavior remains limited due to the discrete nature of the rules. To address this issue, we impose the axis-aligned constraint on *differentiable* decision trees, or *soft trees*, which relax the splitting process of decision trees and are trained with gradient methods. This differentiability enables us to derive their *Neural Tangent Kernel* (NTK), which analytically describes the training behavior. We consider two cases: imposing the axis-aligned constraint throughout the entire training process, or only at initialization. Moreover, we extend the NTK framework to handle various tree architectures simultaneously, and prove that any axis-aligned non-oblivious tree ensemble can be transformed into an axis-aligned oblivious tree ensemble with the same limiting NTK. By excluding non-oblivious trees from the search space, the cost of the trial-and-error procedures required for model selection can be massively reduced.
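To make the two key ingredients of the abstract concrete, the sketch below illustrates a soft (sigmoidal) split and an axis-aligned constraint. This is a minimal hypothetical example, not the paper's implementation: the function names `soft_split` and `axis_aligned` and all parameter values are assumptions for illustration only.

```python
import math

def soft_split(x, w, b):
    # Soft relaxation of a decision-tree split: instead of a hard
    # threshold, return the probability of routing sample x left,
    # via a sigmoid of the linear response w . x + b.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def axis_aligned(w, axis):
    # Axis-aligned constraint: the split may depend on only one
    # input feature, so zero out every other weight.
    return [wi if i == axis else 0.0 for i, wi in enumerate(w)]

x = [0.5, -1.0, 2.0]
w = axis_aligned([0.8, 0.3, -0.4], axis=0)  # split uses feature 0 only
p_left = soft_split(x, w, 0.0)              # differentiable in w and b
```

Because `soft_split` is differentiable in its parameters, an ensemble of such trees can be trained by gradient descent, which is what makes an NTK analysis applicable.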
Submission Number: 13