HyperRAG: Hierarchy-Aware Retrieval-Augmented Generation with Hyperbolic Embeddings for Ontology-Based Entity Linking

16 Sept 2025 (modified: 01 Dec 2025) | ICLR 2026 Conference Withdrawn Submission | Readers: Everyone | CC BY 4.0
Keywords: Entity Linking, Hyperbolic Embeddings, RAG, Ontology, Hierarchy-aware evaluation metrics, Biomedical NLP
TL;DR: HyperRAG improves ontology-based entity linking in text by uniting LLM span detection, RAG, and hyperbolic hierarchy-aware reranking.
Abstract: Extracting structured knowledge from unstructured text is a fundamental challenge in machine learning, particularly when the target concepts are organized within complex hierarchical ontologies. We present HyperRAG, a novel framework that integrates Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) and hierarchical reranking using hyperbolic embeddings. Our approach is designed to improve entity linking and retrieval in settings where the label space exhibits rich hierarchical relationships. In addition, we introduce a hierarchy-aware evaluation framework that leverages ontology structure to provide a more nuanced assessment of model performance, moving beyond conventional exact-match metrics. Through comprehensive experiments on both benchmark and real-world datasets, including a newly curated and challenging set of clinical notes for phenotype extraction in precision medicine, we demonstrate that HyperRAG substantially improves ranking accuracy and recall, especially for implicit or nuanced entity mentions. While our primary application is in the biomedical domain, the proposed framework is broadly applicable and generalizable to hierarchical entity linking and retrieval tasks in other domains. All code, models, and datasets are released to support reproducibility.
Primary Area: learning on graphs and other geometries & topologies
Supplementary Material: zip
Submission Number: 7107
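
Illustrative sketch (not from the submission): the abstract describes reranking retrieved ontology candidates with hyperbolic embeddings but does not state the model or scoring rule. The snippet below assumes the Poincaré ball model and ranks candidates by geodesic distance to a query embedding. The function names, the 2-D embeddings, and the concept labels are all hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


def rerank(query, candidates):
    """Sort (label, embedding) pairs by increasing hyperbolic distance to the query."""
    return sorted(candidates, key=lambda c: poincare_distance(query, c[1]))


# Hypothetical embeddings: a general concept near the origin, two more
# specific concepts nearer the boundary of the ball (assumed layout).
candidates = [
    ("root_concept", (0.0, 0.0)),
    ("child_a", (0.6, 0.0)),
    ("child_b", (0.0, 0.6)),
]
query = (0.55, 0.05)  # hypothetical embedding of a detected mention

ranked = rerank(query, candidates)
print([label for label, _ in ranked])
# → ['child_a', 'root_concept', 'child_b']
```

In this toy layout the general concept at the origin ranks ahead of the unrelated sibling `child_b`, which shows how distances in the ball can favor an ancestor over a distant branch. Whether HyperRAG scores candidates this way is an assumption here.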