Riemann-Lebesgue Forest for Regression

TMLR Paper 4191 Authors

12 Feb 2025 (modified: 04 Jun 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: We propose a novel ensemble method called Riemann-Lebesgue Forest (RLF) for regression. The core idea of RLF is to mimic the way a measurable function can be approximated by partitioning its range into a few intervals. With this idea in mind, we develop a new tree learner named Riemann-Lebesgue Tree (RLT), which has a chance to perform a ``Lebesgue'' type cut, i.e., splitting certain non-terminal nodes on the response Y. In other words, we introduce ``splitting type randomness'' into the training of our ensemble method. Since the information of Y is unavailable at prediction time, weak local models such as small random forests or decision trees are fitted at non-terminal nodes with ``Lebesgue'' type cuts to determine which child node to proceed to. We show that the optimal ``Lebesgue'' type cut yields a larger variance reduction in the response Y than an ordinary CART cut (an analogue of a Riemann partition) when fitting a base tree. This property benefits the ensemble part of RLF, which is verified by extensive experiments. We also establish the asymptotic normality of RLF under different parameter settings. Two one-dimensional examples are provided to illustrate the flexibility of RLF. The competitive performance of RLF with small local random forests against the original random forest (RF) and boosting methods such as XGBoost is demonstrated by extensive experiments on simulated data and real-world datasets. Additional experiments further show that RLF with local decision trees can achieve performance comparable to that of RF with less running time, especially on large datasets.
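To make the abstract's core mechanism concrete, here is a minimal sketch, not the authors' implementation, of a ``Lebesgue'' type cut: the node's samples are partitioned by thresholding the response Y so as to maximize variance reduction in Y, and a weak local model is fitted on the covariates to reproduce that routing at prediction time, when Y is unavailable. The function name `lebesgue_split`, the quantile grid for candidate thresholds, and the choice of a depth-2 `DecisionTreeClassifier` as the local router are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def lebesgue_split(X, y, max_depth=2):
    """Hypothetical sketch of a 'Lebesgue' type cut: split the node by
    thresholding the response y (i.e., partitioning its range), then fit
    a small local model on X to route points when y is unobserved."""
    # Pick the threshold on y that maximizes variance reduction in y,
    # scanning a small grid of y-quantiles as candidate cut points.
    best_t, best_gain = None, -np.inf
    base_sse = np.var(y) * len(y)
    for t in np.quantile(y, np.linspace(0.1, 0.9, 9)):
        left, right = y[y <= t], y[y > t]
        if len(left) == 0 or len(right) == 0:
            continue
        gain = base_sse - (np.var(left) * len(left) + np.var(right) * len(right))
        if gain > best_gain:
            best_gain, best_t = gain, t
    if best_t is None:  # degenerate node: y is constant, no useful cut
        return None, None
    # Child labels encode the y-based partition; the weak local model
    # learns to reproduce this routing from the covariates alone.
    labels = (y > best_t).astype(int)
    router = DecisionTreeClassifier(max_depth=max_depth).fit(X, labels)
    return best_t, router
```

At prediction time, an unseen point x would be sent to the child indicated by `router.predict(x)`, mirroring the paper's use of small random forests or decision trees as local models at nodes with ``Lebesgue'' type cuts.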
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
* Removed or weakened the overstated claims. For example, we removed statements such as "...Riemann-Lebesgue Forest (RLF), which has superior performance to the ordinary Random Forest in the regression task", as pointed out by the reviewer on page 1.
* Emphasized the utility of the local models in Algorithm 1.
* Rephrased the statements about the performance of RLF by mentioning the use of the local models throughout the article.

All changes are highlighted in red.
Assigned Action Editor: ~Benjamin_Guedj1
Submission Number: 4191