Keywords: Hyperbolic Geometry, Modern Hopfield Networks, Associative Memory, Riemannian Optimization, Hierarchical Representation Learning
TL;DR: We extend modern Hopfield networks to hyperbolic space with a model-agnostic, Riemannian energy formulation, yielding a plug-and-play memory module that excels on deeply hierarchical data while matching Euclidean baselines on shallow cases.
Abstract: Associative memory models encode a set of candidate patterns as “memories” and, given a partial or noisy query, retrieve the most relevant patterns via similarity-based energy minimization, thereby recovering target patterns from incomplete inputs; they have seen widespread success across perception and representation learning tasks. However, when retrieval is constrained to Euclidean geometry, hierarchical structure in the data is hard to capture faithfully: on tasks that involve hierarchical data, Hopfield networks built on Euclidean representations tend to introduce bias and distortion into semantic relations.
To this end, we extend modern Hopfield retrieval to hyperbolic space. Specifically, we map query and memory vectors from Euclidean to hyperbolic space via exponential maps and define a theoretically grounded energy function based on the Minkowski inner product; retrieval then proceeds by Riemannian manifold optimization, combining curvature-aware gradients with exponential maps so that the optimization trajectory stays on the manifold and updates remain stable.
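The retrieval loop described above can be illustrated in the Lorentz (hyperboloid) model of curvature −1, where the Minkowski inner product is native. The sketch below is a hypothetical NumPy illustration, not the paper's exact formulation: the log-sum-exp energy, the inverse temperature `beta`, and the step size `lr` are all assumptions made for the example.

```python
import numpy as np

def minkowski(u, v):
    # Minkowski (Lorentzian) inner product: -u0*v0 + <u_1:, v_1:>
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def exp_map_origin(u):
    # Lift a Euclidean vector u in R^d onto the hyperboloid (curvature -1)
    # via the exponential map at the origin o = (1, 0, ..., 0).
    n = np.linalg.norm(u, axis=-1, keepdims=True).clip(min=1e-9)
    return np.concatenate([np.cosh(n), np.sinh(n) * u / n], axis=-1)

def energy(q, M, beta=1.0):
    # Assumed log-sum-exp energy over Minkowski similarities <q, m_j>_L.
    return -np.log(np.sum(np.exp(beta * minkowski(q, M)))) / beta

def retrieval_step(q, M, beta=1.0, lr=0.1):
    # One curvature-aware Riemannian gradient step. The inverse Minkowski
    # metric cancels the sign flip in the Euclidean gradient, leaving
    # -sum_j w_j m_j, which we project onto the tangent space at q and
    # follow with the exponential map so the iterate stays on the manifold.
    sims = minkowski(q, M)                         # (n_memories,)
    w = np.exp(beta * (sims - sims.max()))
    w /= w.sum()                                   # softmax attention weights
    h = -(w[:, None] * M).sum(axis=0)
    rgrad = h + minkowski(q, h) * q                # tangent-space projection
    v = -lr * rgrad
    vn = np.sqrt(max(minkowski(v, v), 1e-18))      # tangent vectors are spacelike
    return np.cosh(vn) * q + np.sinh(vn) * v / vn  # exponential map at q
```

Iterating `retrieval_step` descends the energy while keeping the query exactly on the hyperboloid, which is the stability property the Riemannian formulation is meant to provide.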
Our central claim is a hierarchy-sensitivity hypothesis: when the data exhibit clear, deep hierarchical structure, hyperbolic geometry yields statistically significant improvements; when the hierarchy is weak or shallow, performance does not differ significantly from Euclidean modern Hopfield networks. We validate this through depth-controlled comparisons and cross-level consistency metrics, and the empirical results are consistent with the hypothesis.
Accordingly, the proposed hyperbolic associative memory can serve as a plug-and-play memory module embedded in task architectures that require hierarchical understanding, storing and retrieving raw inputs, intermediate representations, or learned prototypes while explicitly exploiting hierarchical information.
Moreover, our method is formulated in a model-agnostic manner and applies to any hyperbolic model of constant negative curvature. In this paper we instantiate it on the Poincaré ball for experiments.
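For the Poincaré-ball instantiation, the exponential map at the origin (and its logarithmic inverse) carries Euclidean vectors into the ball of curvature −c. The helper names and the choice of basing the maps at the origin are assumptions for this sketch; these are the standard closed-form maps for the Poincaré ball, not code from the paper.

```python
import numpy as np

def poincare_exp0(v, c=1.0):
    # Exponential map at the origin of the Poincaré ball with curvature -c:
    # exp_0(v) = tanh(sqrt(c)*||v||) * v / (sqrt(c)*||v||).
    sc = np.sqrt(c)
    n = np.linalg.norm(v, axis=-1, keepdims=True).clip(min=1e-9)
    return np.tanh(sc * n) * v / (sc * n)

def poincare_log0(x, c=1.0):
    # Logarithmic map at the origin (inverse of poincare_exp0):
    # log_0(x) = artanh(sqrt(c)*||x||) * x / (sqrt(c)*||x||).
    sc = np.sqrt(c)
    n = np.linalg.norm(x, axis=-1, keepdims=True).clip(min=1e-9)
    return np.arctanh(sc * n) * x / (sc * n)
```

The mapped points always satisfy ||x|| < 1/sqrt(c), i.e., they lie strictly inside the ball, and exp/log are exact inverses, so memories and queries can be lifted into the ball before retrieval and read back afterward without distortion.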
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 22346