Low-distortion and GPU-compatible Tree Embeddings in Hyperbolic Space

Published: 01 May 2025, Last Modified: 18 Jun 2025. Venue: ICML 2025 poster. License: CC BY 4.0
TL;DR: We propose a method for embedding trees in hyperbolic space that optimizes hyperspherical point separation and uses floating point expansion arithmetic to maintain GPU compatibility.
Abstract: Embedding tree-like data, from hierarchies to ontologies and taxonomies, forms a well-studied problem for representing knowledge across many domains. Hyperbolic geometry provides a natural solution for embedding trees, with vastly superior performance over Euclidean embeddings. Recent literature has shown that hyperbolic tree embeddings can even be placed on top of neural networks for hierarchical knowledge integration in deep learning settings. For all applications, a faithful embedding of trees is needed, with combinatorial constructions emerging as the most effective direction. This paper identifies and solves two key limitations of existing works. First, the combinatorial construction hinges on finding highly separated points on a hypersphere, a notoriously difficult problem. Current approaches achieve poor separation, degrading the quality of the corresponding hyperbolic embedding. We propose highly separated Delaunay tree embeddings (HS-DTE), which integrate angular separation into a generalized formulation of Delaunay embeddings, leading to lower embedding distortion. Second, low distortion requires additional precision. The current approach for increasing precision is to use multiple precision arithmetic, which renders the embeddings unusable on GPUs in deep learning settings. We reformulate the combinatorial construction using floating point expansion arithmetic, leading to superior embedding quality while retaining utility on accelerated hardware.
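To make the idea of floating point expansion arithmetic concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain Python floats (IEEE 754 doubles) rather than GPU tensors, and the function names `two_sum`, `grow_expansion`, and `exact_value` are chosen here for illustration. A number is stored as an unevaluated sum of ordinary floats, and each addition uses an error-free transformation so that the rounding error is kept as an extra component instead of being discarded.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum: return (s, e) where s is the rounded sum of a and b
    and e is the rounding error, so that s + e equals a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def grow_expansion(expansion: list[float], x: float) -> list[float]:
    """Add the float x to an expansion whose components are ordered from
    smallest to largest magnitude. The exact value of the result equals
    the exact value of the input expansion plus x."""
    out = []
    carry = x
    for component in expansion:
        carry, err = two_sum(carry, component)
        out.append(err)
    out.append(carry)
    return out


def exact_value(expansion: list[float]) -> Fraction:
    """Exact rational value of an expansion, used only for checking."""
    return sum((Fraction(c) for c in expansion), Fraction(0))


# 1.0 + 1e-17 rounds to 1.0 in double precision; the expansion keeps the rest.
s, e = two_sum(1.0, 1e-17)
expansion = grow_expansion([e, s], 1e-30)
print(s, e)
print(expansion)
```

Because every step uses only standard floating point additions and subtractions, the same pattern can in principle be applied elementwise to arrays on accelerated hardware, which is the property the abstract relies on.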
Lay Summary: Organizing and representing knowledge that has a tree-like structure, such as family trees, topic hierarchies, or classification systems, is an important and well-explored challenge in many fields. One powerful way to represent these structures is through hyperbolic geometry, which is much better suited than traditional Euclidean spaces for capturing the nature of hierarchical data. Recent research has shown that hyperbolic representations can even be used by deep learning models to better handle complex, layered knowledge. To obtain good results with such deep learning approaches, the hyperbolic representation of the tree must closely mirror its original structure. A class of methods known as combinatorial constructions has emerged as a promising approach for producing high-quality representations. However, current versions of these methods often lead to poor results for many types of trees. In this paper, we identify two key limitations in existing combinatorial constructions and propose solutions to both. The first issue arises from a step in the construction that depends on solving a long-standing mathematical challenge: the uniform placement of points on a high-dimensional sphere. Existing methods use crude approximations for this step, which leads to inaccurate representations. We introduce highly separated Delaunay tree embeddings (HS-DTE), which uses an improved approach to placing the points, resulting in representations that more faithfully capture the original tree structures. The second issue is computational. Previous methods rely on multiple precision arithmetic, which involves keeping many decimal digits. This type of computation is not supported on a GPU, which is the type of hardware that is needed for deep learning. We reformulate the computations using floating point expansion arithmetic, a technique which has the same benefits while still being compatible with GPUs. 
The result is a method that produces accurate representations and is suitable for use in modern deep learning systems.
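The point-placement step described above can be judged by the smallest angle between any two points on the sphere: the larger that angle, the better the points are separated. The sketch below only measures this quantity for a given set of points; it is not the HS-DTE optimization itself, and the function name `min_pairwise_angle` and the two example point sets are assumptions made for illustration.

```python
import math
from itertools import combinations


def min_pairwise_angle(points):
    """Return the smallest angle (in radians) between any two of the given
    nonzero vectors, after projecting them onto the unit hypersphere."""
    smallest = math.pi
    for u, v in combinations(points, 2):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        # Clamp to [-1, 1] to guard against rounding before calling acos.
        cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
        smallest = min(smallest, math.acos(cosine))
    return smallest


# Six well-separated directions in 3D: the vertices of an octahedron.
well_separated = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]

# Six directions where three of them crowd around the same axis.
clustered = [
    (1.0, 0.0, 0.0), (1.0, 0.1, 0.0), (1.0, 0.0, 0.1),
    (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.0),
]

print(min_pairwise_angle(well_separated))
print(min_pairwise_angle(clustered))
```

In a combinatorial construction the children of a node are placed along such directions, so a clustered set leaves less angular room between subtrees than a well-separated one.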
Link To Code: https://github.com/maxvanspengler/hyperbolic_tree_embeddings
Primary Area: General Machine Learning->Representation Learning
Keywords: Hyperbolic Geometry, Hyperbolic Tree Embeddings, Representation Learning, Hierarchical Learning
Submission Number: 9747