Hyperbolic Random Forests

Published: 06 Jun 2024, Last Modified: 06 Jun 2024
Accepted by TMLR
Abstract: Hyperbolic space is becoming a popular choice for representing data due to the hierarchical structure, whether implicit or explicit, of many real-world datasets. Along with it comes a need for algorithms capable of solving fundamental tasks, such as classification, in hyperbolic space. Recently, multiple papers have investigated hyperbolic alternatives to hyperplane-based classifiers, such as logistic regression and SVMs. While effective, these approaches struggle with more complex hierarchical data. We therefore propose to generalize the well-known random forests to hyperbolic space. We do this by redefining the notion of a split using horospheres. Since finding the globally optimal split is computationally intractable, we find candidate horospheres through a large-margin classifier. To make hyperbolic random forests work on multi-class and imbalanced data, we furthermore outline new methods for combining classes based on the lowest common ancestor and class-balanced large-margin losses. Experiments on standard and new benchmarks show that our approach outperforms both conventional random forest algorithms and recent hyperbolic classifiers.
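Illustration (not from the paper): the horosphere split mentioned in the abstract can be sketched under some assumptions. In the Poincaré ball model, a horosphere is a level set of the Busemann function B_w(x) = log(||w - x||^2 / (1 - ||x||^2)) for an ideal point w on the unit sphere. A split then sends a point to one child if B_w(x) <= b and to the other child otherwise. The choice of the Poincaré ball, the sign convention, and the names busemann and horosphere_split are assumptions made here for illustration. They are not taken from the HoroRF code, and the sketch does not cover how the paper selects w and b with a large-margin classifier.

```python
import math


def busemann(x, w):
    """Busemann function on the Poincare ball.

    x: point strictly inside the unit ball (sequence of floats).
    w: ideal point on the boundary, i.e. a unit vector.
    """
    sq_dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    sq_norm = sum(xi ** 2 for xi in x)
    return math.log(sq_dist / (1.0 - sq_norm))


def horosphere_split(points, w, b):
    """Partition points by the horosphere {x : B_w(x) = b}.

    Returns (inside, outside), where 'inside' holds the points in the
    horoball B_w(x) <= b. This is a hypothetical helper, not the paper's API.
    """
    inside, outside = [], []
    for p in points:
        (inside if busemann(p, w) <= b else outside).append(p)
    return inside, outside


# Example: ideal point on the positive x-axis, threshold b = 0.5.
w = (1.0, 0.0)
points = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0)]
inside, outside = horosphere_split(points, w, b=0.5)
```

In this example the origin and (0.5, 0.0) fall inside the horoball, while (-0.5, 0.0), which lies on the far side from w, falls outside. Unlike a Euclidean threshold on a single feature, the decision boundary is a sphere tangent to the boundary of the ball at w.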
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/LarsDoorenbos/HoroRF
Supplementary Material: pdf
Assigned Action Editor: ~Novi_Quadrianto1
Submission Number: 2195