NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes

ACL ARR 2025 May Submission5144 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Retrieval-augmented generation (RAG) empowers large language models to access external and private corpora, enabling factually consistent responses in specific domains. By exploiting the inherent structure of the corpus, graph-based RAG methods enrich this process by building a knowledge graph index and leveraging the structural nature of graphs. However, current graph-based RAG approaches seldom prioritize the design of graph structures. Inadequately designed graphs impede the seamless integration of diverse graph algorithms and result in workflow inconsistencies and degraded performance. To further unlock the potential of graphs for RAG, we propose NodeRAG, a heterogeneous graph-centric framework that enables the seamless and holistic integration of graph-based methodologies into the RAG workflow. By aligning closely with the capabilities of LLMs, this framework ensures a fully cohesive and efficient end-to-end process. Through extensive experiments, we demonstrate that NodeRAG outperforms previous methods, including GraphRAG and LightRAG, not only in indexing time, query time, and storage efficiency but also in question-answering performance on multi-hop benchmarks and open-ended head-to-head evaluations, while using minimal retrieval tokens.
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: Retrieval-Augmented Generation, Information Retrieval, Graph Theory and Algorithms
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5144