Keywords: Hyperbolic embeddings, Hyperbolic geometry, Hierarchical data, Tree metrics, Geometric machine learning, Metric learning, Riemannian optimization
Abstract: When embedding hierarchical graph data (e.g., trees), practitioners face a fundamental
choice: increase Euclidean dimension or use low-dimensional hyperbolic spaces. We provide
a deployable decision rule, backed by rigorous theory and designed to integrate into
graph-learning pipelines, that determines which geometry to use based on tree structure
and desired distortion tolerance. For balanced $b$-ary trees of height $h$ with heterogeneous edge weights, we prove that any
embedding into fixed $d$-dimensional Euclidean space must incur distortion scaling as
$(b^{\lfloor h/2\rfloor})^{1/d}$, with the dependence on weight heterogeneity being tight. Beyond balanced trees, we extend the lower bound to arbitrary trees via an
effective width parameter that captures the count of edge-disjoint depth-$r$ suffixes.
Under random edge perturbations, we provide high-probability refinements that improve the constants
while preserving the fundamental scaling, and we further show these refinements remain valid under locally correlated or $\alpha$-mixing noise processes on edges.
On the hyperbolic side, we present an explicit constant-distortion construction in the hyperbolic plane with concrete curvature and
radius requirements, demonstrating how negative curvature can substitute for additional
Euclidean dimensions. These results yield a simple decision rule: given basic statistics of a (possibly unbalanced)
tree (height, effective width, weight spread) and a target distortion, it returns either
(i) the minimum Euclidean dimension needed or (ii) feasible hyperbolic parameters that achieve the target
within budget. Finally, we show that for general DAGs, a tree-minor witness transfers our lower bound,
so the decision rule remains applicable.
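To make the Euclidean side of the decision rule concrete, here is a minimal sketch assuming only the scaling stated above: it inverts distortion $\gtrsim (b^{\lfloor h/2\rfloor})^{1/d}$ (or its effective-width analogue) to obtain a dimension estimate and falls back to the hyperbolic plane when that estimate exceeds a dimension budget. The function name, the dropped constants and weight-spread factor, and the placeholder hyperbolic parameters are illustrative assumptions, not the paper's actual rule.

```python
import math

def decide_geometry(height, branching, target_distortion, dim_budget,
                    effective_width=None):
    """Toy sketch of the decision rule, using only the scaling quoted in the
    abstract.  Constants and the weight-spread dependence are omitted, and the
    hyperbolic parameters are placeholders (assumptions, not the paper's rule).

    Euclidean lower bound: distortion >~ (b**floor(h/2))**(1/d), so reaching a
    target distortion tau > 1 needs roughly
        d >= floor(h/2) * log(b) / log(tau),
    or d >= log(W) / log(tau) when an effective width W is supplied instead.
    """
    if target_distortion <= 1:
        raise ValueError("target distortion must exceed 1 for this bound")
    # Logarithmic growth of the tree: effective width if given, else b^(h/2).
    log_growth = (math.log(effective_width) if effective_width is not None
                  else (height // 2) * math.log(branching))
    d_min = math.ceil(log_growth / math.log(target_distortion))
    if d_min <= dim_budget:
        return ("euclidean", {"dimension": d_min})
    # Otherwise fall back to the hyperbolic plane: the construction achieves
    # constant distortion in H^2, but its concrete curvature and radius
    # requirements are not stated in the abstract, so they stay symbolic here.
    return ("hyperbolic", {"model": "H^2",
                           "curvature": "see constant-distortion construction",
                           "radius": "see constant-distortion construction"})

# Example: a balanced binary tree of height 20 with a 1.2x distortion budget.
print(decide_geometry(height=20, branching=2, target_distortion=1.2,
                      dim_budget=50))
```

For instance, a balanced binary tree of height 20 with distortion tolerance 1.2 already needs roughly 39 Euclidean dimensions under this crude bound, which is the regime where the constant-distortion $\mathbb{H}^2$ construction becomes attractive.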
Submission Number: 129