Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings

TMLR Paper6831 Authors

06 Jan 2026 (modified: 17 Jan 2026) · Under review for TMLR · CC BY 4.0
Abstract: Selective state-space models excel at long-sequence modeling, but their capacity for language representation, particularly in complex hierarchical reasoning, remains underexplored. Most large language models rely on flat Euclidean embeddings, limiting their ability to capture latent hierarchies. To address this, we propose Hierarchical Mamba (HiM), which integrates the efficient Mamba2 with hyperbolic geometry to learn hierarchy-aware language embeddings for deeper linguistic understanding. Mamba2-processed sequences are projected onto the Poincaré ball or the Lorentzian manifold with learnable curvature and optimized with a hyperbolic loss. HiM facilitates capturing relational distances across varying hierarchical levels, enabling effective long-range reasoning for tasks such as mixed-hop prediction and multi-hop inference in hierarchical classification. Experimental results show that both HiM variants effectively capture hierarchical relationships across four linguistic and medical datasets, surpassing Euclidean baselines: HiM-Poincaré provides finer-grained distinctions with higher h-norms, while HiM-Lorentz offers more stable, compact, and hierarchy-preserving embeddings.
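To illustrate the projection step described in the abstract, the following is a minimal sketch (not the authors' implementation) of mapping Euclidean sequence embeddings onto a Poincaré ball with a learnable curvature via the exponential map at the origin. The module name, the pooled backbone output `mamba2_pooled`, and all hyperparameters are hypothetical placeholders.

```python
# Hypothetical sketch: Poincaré projection with learnable curvature.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoincareProjection(nn.Module):
    def __init__(self, init_curvature: float = 1.0, eps: float = 1e-5):
        super().__init__()
        # Learnable curvature parameter, kept positive via softplus.
        self.raw_c = nn.Parameter(torch.tensor(init_curvature))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        c = F.softplus(self.raw_c)
        sqrt_c = c.sqrt()
        norm = x.norm(dim=-1, keepdim=True).clamp_min(self.eps)
        # Exponential map at the origin of the Poincaré ball:
        #   exp_0(x) = tanh(sqrt(c) * ||x||) * x / (sqrt(c) * ||x||)
        return torch.tanh(sqrt_c * norm) * x / (sqrt_c * norm)

# Usage: embed a pooled Mamba2 representation into hyperbolic space.
# `mamba2_pooled` stands in for the backbone output (batch, hidden_dim).
proj = PoincareProjection()
mamba2_pooled = torch.randn(8, 256)
hyp_emb = proj(mamba2_pooled)  # points lie inside the ball of radius 1/sqrt(c)
```

The resulting hyperbolic embeddings would then feed a hyperbolic (distance-based) loss; the Lorentzian variant would use the analogous exponential map on the hyperboloid instead.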
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=5UAe37a7T5
Changes Since Last Submission: Changes since submission 6782: 1. Removed the public GitHub link from the manuscript and replaced it with an anonymous GitHub repository link suitable for double-blind review. 2. Fully anonymized the codebase by removing any identifying information that could reveal author identity.
Assigned Action Editor: ~Nadav_Cohen1
Submission Number: 6831