HyperRAG: Query-Centric Retrieval Augmented Generation with Hyperbolic Structuring

16 Sept 2025 (modified: 21 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: RAG; LLMs
TL;DR: We build a hyperbolic graph that improves retrieval performance and helps LLMs generate answers.
Abstract: Retrieval-Augmented Generation (RAG) has demonstrated significant potential in enhancing question answering systems by supplementing large language models (LLMs) with external knowledge. However, existing approaches focus primarily on retrieving isolated factual entities while neglecting the reasoning relationships between them. To address this limitation, Graph Retrieval-Augmented Generation (GraphRAG) has emerged as an effective solution that explicitly integrates structured knowledge graphs into LLMs to support complex reasoning tasks. Although diverse corpus retrieval methods have been explored, they typically rely on static, query-agnostic graphs constructed via fixed heuristics. This motivates a query-centric retrieval framework that adaptively constructs a graph tailored to each query. Two challenges arise: accurately identifying the latent relationships between a query and the corpus is difficult, and unifying multiple local-perspective connections into a globally coherent structured corpus introduces additional complexity. To this end, we introduce HyperRAG, a novel framework in hyperbolic space that captures both explicit entity-based links and implicit logical connections inferred by the LLM. Our main contributions are: (i) a dual-stage prompting strategy that guides the LLM to identify relevant passages and their implicit relationships based on the query; (ii) a hierarchical graph unification paradigm that models each query-specific graph as a minimal subtree and integrates these subtrees into a cohesive graph; and (iii) a hyperbolic space embedding approach that effectively preserves the hierarchical structure during graph learning. Extensive experiments on three benchmark datasets show remarkable improvements over existing methods on all three.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 6894
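
The abstract states that a hyperbolic embedding preserves hierarchical structure but does not say which model of hyperbolic space is used. As a minimal illustrative sketch, assuming the Poincaré ball model (a common choice, not confirmed by the abstract), the snippet below computes the geodesic distance between points in the unit ball. The function name `poincare_distance` and the sample points are hypothetical. It illustrates why such spaces suit tree-like data: the distance between two points near the boundary is close to the sum of their distances to the origin, as if the shortest path ran through a shared root.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Assumption: the Poincare ball model is used purely for illustration;
    the abstract does not specify a hyperbolic model.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


# Hypothetical points: a "root" at the origin and two "leaves" near the boundary.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

d_root_a = poincare_distance(root, leaf_a)
d_root_b = poincare_distance(root, leaf_b)
d_a_b = poincare_distance(leaf_a, leaf_b)

# Fraction of the "through the root" path length covered by the direct geodesic.
hyperbolic_ratio = d_a_b / (d_root_a + d_root_b)
euclidean_ratio = math.dist(leaf_a, leaf_b) / (math.dist(root, leaf_a) + math.dist(root, leaf_b))
```

With these sample points, `hyperbolic_ratio` is larger than `euclidean_ratio`, meaning the two leaves are nearly as far apart as the detour through the root. That tree-like behavior is what makes hyperbolic space a natural fit for hierarchical graphs.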