TL;DR: We introduce a fixed-point tree-growing algorithm that replaces the gradient-based splitting mechanism in generalized random forests (GRFs), enabling faster, more memory-efficient forest construction while preserving estimator accuracy.
Abstract: We propose a computationally efficient alternative to generalized random forests (GRFs) for estimating heterogeneous effects in large dimensions. While GRFs rely on a gradient-based splitting criterion, which becomes computationally expensive and unstable in large dimensions, our method introduces a fixed-point approximation that eliminates the need for Jacobian estimation. This gradient-free approach preserves GRF's theoretical guarantees of consistency and asymptotic normality while significantly improving computational efficiency. We demonstrate that our method achieves a multifold speedup over standard GRFs without compromising statistical accuracy. Experiments on both simulated and real-world data validate our approach. Our findings suggest that the proposed method is a scalable alternative for localized effect estimation in machine learning and causal inference applications.
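To illustrate where the savings come from, the sketch below contrasts GRF-style gradient-based pseudo-outcomes, which require estimating and inverting a K x K Jacobian of the estimating equation at every parent node, with a gradient-free alternative that uses the per-sample scores directly. This is a schematic illustration only, not the paper's actual FPT update (which is in the linked repository); the least-squares estimating equation, variable names, and the specific gradient-free rule shown here are stand-ins chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 500, 8  # samples in a parent node, number of effects estimated jointly

# Stand-in estimating equation: psi_theta(O_i) = W_i * (Y_i - W_i^T theta),
# i.e. a multi-dimensional least-squares problem as a proxy for effect estimation.
W = rng.normal(size=(n, K))          # per-sample regressors / treatment indicators
theta_true = rng.normal(size=K)
Y = W @ theta_true + rng.normal(size=n)

theta_hat = np.linalg.lstsq(W, Y, rcond=None)[0]
psi = W * (Y - W @ theta_hat)[:, None]   # per-sample scores, shape (n, K)

# Gradient-based pseudo-outcomes (GRF-style): estimate the K x K Jacobian
# of the estimating equation and solve against it -- an O(K^3) factorization
# repeated at every parent node, which dominates cost as K grows.
A = (W[:, :, None] * W[:, None, :]).mean(axis=0)   # approx. E[d psi / d theta]
rho_grad = psi @ np.linalg.inv(A).T

# Gradient-free pseudo-outcomes (fixed-point-style stand-in): use the scores
# directly, skipping Jacobian estimation and inversion entirely -- O(nK) work.
rho_fp = psi

print(rho_grad.shape, rho_fp.shape)
```

The point of the contrast is purely computational: both variants produce an (n, K) matrix of pseudo-outcomes that a standard CART-style split search can consume unchanged, but only the first pays the per-node Jacobian cost that scales cubically in the number of effects.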
Lay Summary: In many fields, like medicine or public policy, it's crucial to know not just whether something works, but who it works best for. Do patients respond differently to a new drug? Do economic policies affect different communities in distinct ways? Generalized random forests (GRFs) are a powerful method tailored to measuring these kinds of group-specific effects. However, GRFs can become bogged down in complex and time-consuming calculations when estimating many effects at the same time.
Our idea is to replace the source of these expensive calculations with a much faster, streamlined approximation that sidesteps the problem altogether. Our "fixed-point tree" (FPT) approach is not only faster, often several times faster, but its speed advantage over GRF grows as the number of effects increases. The key result is that the original GRF approach and our faster GRF-FPT approach are about equally accurate, so the speedup comes at no cost in accuracy. This makes the already powerful method of GRFs even more attractive for large-scale problems.
Link To Code: https://github.com/dfleis/grf-experiments
Primary Area: General Machine Learning->Supervised Learning
Keywords: Generalized Random Forests, Fixed-Point Methods, Ensemble Methods, Causal Inference, Computational Efficiency
Submission Number: 14206