TopoRAG: Graph-based RAG via Topology-aware Approximate Nearest Neighbor Search

ACL ARR 2026 January Submission 9543 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Retrieval-Augmented Generation, Graph-based RAG, Approximate Nearest Neighbor Search, Cohesive Subgraph Discovery, Large Language Models
Abstract: Retrieval-augmented generation (RAG) has become a core technique for improving the factuality and reasoning ability of large language models. Recent efforts extend RAG with graph-structured knowledge, enhancing retrieval to capture relational context beyond isolated text chunks. However, many graph-based RAG systems rely on a two-stage pipeline: (i) classical approximate nearest neighbor (ANN) search to identify the top-$k$ entities in the embedding space, and (ii) heuristic neighbor expansion, which augments the retrieved set by traversing immediate neighbors. This design underutilizes graph topology during retrieval and often introduces noisy or high-degree neighbors, leading to suboptimal evidence selection. In this paper, we propose TopoRAG, a retrieval framework that directly integrates structural constraints into ANN search via a diameter-constrained formulation. By selecting entities whose induced subgraph satisfies a diameter bound, TopoRAG enables topology-aware and noise-controlled graph retrieval. Experiments show that our approach consistently improves precision and significantly reduces context redundancy compared to existing methods.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: retrieval-augmented generation, passage retrieval, graph-based methods, knowledge-augmented methods, structured prediction, multihop QA
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 9543
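
Illustrative sketch (not the authors' algorithm): the abstract describes selecting entities whose induced subgraph satisfies a diameter bound. The abstract does not give the procedure, so the Python below shows only one simple way such a constraint could be enforced: a greedy pass over candidates in descending similarity order, which keeps a candidate only if the subgraph induced by the selected set stays connected and within the bound. The function names, the adjacency-dict graph representation, and the greedy strategy are all assumptions made for illustration.

```python
from collections import deque


def induced_diameter(adj, nodes):
    """Diameter of the subgraph induced by `nodes` (inf if disconnected).

    `adj` maps each node to an iterable of its neighbors (undirected graph).
    """
    nodes = set(nodes)
    worst = 0
    for src in nodes:
        # BFS restricted to the induced subgraph.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf")  # some selected node is unreachable
        worst = max(worst, max(dist.values()))
    return worst


def greedy_diameter_constrained_topk(adj, scores, k, max_diameter):
    """Hypothetical greedy selection: scan candidates by descending score and
    keep one only if the induced subgraph still meets the diameter bound."""
    selected = []
    for node in sorted(scores, key=scores.get, reverse=True):
        if len(selected) == k:
            break
        if induced_diameter(adj, selected + [node]) <= max_diameter:
            selected.append(node)
    return selected


# Toy graph: path a - b - c - d, plus an isolated node e.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"], "e": []}
# Assumed query-similarity scores, e.g. taken from an ANN index.
scores = {"a": 0.9, "e": 0.8, "b": 0.7, "c": 0.6, "d": 0.5}
result = greedy_diameter_constrained_topk(adj, scores, k=3, max_diameter=2)
```

In this toy case the isolated node e is skipped despite its high score, because adding it would disconnect the induced subgraph. A plain top-k by score would have returned it, which is the kind of topology-unaware selection the abstract argues against.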