Abstract: Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating external knowledge into the generation process. However, existing RAG systems face limitations in processing long-form documents, primarily due to their reliance on fragmented, chunk-based retrieval mechanisms, which often fail to capture complex interdependencies.
To address these limitations, we propose a \textbf{G}oT-perspective \textbf{G}raph \textbf{R}etrieval-\textbf{A}ugmented \textbf{G}eneration Paradigm ($G^2$RAG). $G^2$RAG introduces three key innovations: (1) a dynamic graph construction algorithm that adapts to document structure, (2) a dual-level retrieval framework, and (3) a context-aware retrieval scoring function. These components collectively improve retrieval diversity, semantic completeness, and preservation of contextual relationships.
Experimental results demonstrate that $G^2$RAG achieves an 80\% reduction in inference time compared to LightRAG while maintaining competitive performance on standard query-focused summarization benchmarks. Additionally, we evaluate graph-based RAG on multi-hop reasoning tasks, revealing its limitations in handling such complex tasks.
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: passage retrieval; dense retrieval; document representation
Languages Studied: English
Submission Number: 8042