Abstract: We propose a novel ensemble method called Riemann-Lebesgue Forest (RLF) for
regression. The core idea of RLF is to mimic the way a measurable function
can be approximated by partitioning its range into a few intervals. With this idea
in mind, we develop a new tree learner named Riemann-Lebesgue Tree (RLT)
which has a chance to perform a Lebesgue-type cutting, i.e., splitting a node on the
response Y, at certain non-terminal nodes. We show that the optimal Lebesgue-type
cutting yields a larger variance reduction in the response Y than an ordinary CART
[3] cutting (an analogue of a Riemann partition). This property is beneficial to
the ensemble part of RLF. We also generalize the asymptotic normality of RLF
under different parameter settings. Two one-dimensional examples are provided
to illustrate the flexibility of RLF. The competitive performance of RLF against
the original random forest [2] is demonstrated by experiments on simulated data and
real-world datasets.
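
To make the variance-reduction comparison above concrete, here is a minimal sketch (not the authors' implementation; all function names are hypothetical) contrasting a CART-style split on a feature x with a Lebesgue-type split on the response y, where both candidate splits are scored by the drop in within-node variance of y:

```python
import numpy as np

def variance_reduction(y, mask):
    """Drop in within-node variance of y after splitting samples by a boolean mask."""
    n = len(y)
    y_left, y_right = y[mask], y[~mask]
    if len(y_left) == 0 or len(y_right) == 0:
        return 0.0
    return np.var(y) - len(y_left) / n * np.var(y_left) - len(y_right) / n * np.var(y_right)

def best_cart_gain(x, y):
    """Riemann-style (CART) cutting: scan thresholds on a feature x."""
    return max((variance_reduction(y, x <= t) for t in np.unique(x)[:-1]), default=0.0)

def best_lebesgue_gain(y):
    """Lebesgue-style cutting: scan thresholds on the response y itself."""
    return max((variance_reduction(y, y <= t) for t in np.unique(y)[:-1]), default=0.0)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = np.sin(4.0 * x) + 0.1 * rng.standard_normal(200)  # non-monotone: hard for one cut on x

print("best CART-style gain:    ", best_cart_gain(x, y))
print("best Lebesgue-style gain:", best_lebesgue_gain(y))
```

Since a threshold on y is itself a binary partition of the range of y, the Lebesgue-style gain in this sketch is never smaller than the CART-style gain, mirroring the comparison stated in the abstract.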