Enhancing GraphRAG with Beam Search-Based Path Filtering and Semantic Diversity Score

ACL ARR 2025 February Submission3876 Authors

15 Feb 2025 (modified: 09 May 2025) | ACL ARR 2025 February Submission | CC BY 4.0

Abstract: Graph-based Retrieval-Augmented Generation (GraphRAG) enhances Large Language Models (LLMs) by integrating structured knowledge graphs, but it suffers from suboptimal path selection and redundant entity retrieval. To address these issues, we propose three key improvements: (1) LLM-driven Structured Entity Extraction, which improves query understanding by extracting structured entities prior to retrieval; (2) Beam Search-based Path Filtering, which selects globally coherent reasoning paths instead of relying on greedy nearest-neighbor search; and (3) Semantic Diversity Score (SDS), a novel metric that reduces redundancy by quantifying the diversity of retrieved entity clusters. We evaluate our approach on four multiple-choice QA datasets: MCTest, LexGLUE CaseHold, PubMedQA, and MedQA. Our method improves accuracy over the LLaMA 3.1-8B baseline by +1.16%, +6.53%, +4.9%, and +0.31% on these datasets, respectively, demonstrating enhanced retrieval informativeness and path coherence. Additionally, experiments with Qwen2.5-7B, Gemma2-9B, and LLaMA 3.1-8B show accuracy increases of +12.34%, +22.50%, and +1.33% on MCTest, respectively. While our method improves factual consistency and reasoning quality, further work is needed to adapt SDS to domain-specific tasks such as biomedical question answering.
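
To illustrate the contrast the abstract draws between beam search and greedy nearest-neighbor selection, the following is a minimal sketch, not the authors' implementation. The abstract does not specify how paths are scored, so the additive edge-relevance score, the function name `beam_search_paths`, and the toy graph are all assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical graph format: node -> list of (neighbor, relevance score).
# The scores stand in for whatever query-relevance measure a real system uses.
Graph = Dict[str, List[Tuple[str, float]]]


def beam_search_paths(
    graph: Graph, start: str, max_hops: int, beam_width: int
) -> List[Tuple[float, List[str]]]:
    """Return up to `beam_width` paths from `start`, best cumulative score first."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [start])]
    for _ in range(max_hops):
        candidates = []
        for score, path in beams:
            for neighbor, edge_score in graph.get(path[-1], []):
                if neighbor in path:  # skip cycles
                    continue
                candidates.append((score + edge_score, path + [neighbor]))
        if not candidates:  # no beam can be extended further
            break
        # Keep only the top-scoring partial paths; paths that hit a dead end
        # are dropped in this simplified version.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beams


# Toy example (assumed data): the locally best first hop leads to a weak path.
toy_graph: Graph = {
    "q": [("a", 0.9), ("b", 0.8)],
    "a": [("c", 0.1)],
    "b": [("d", 0.9)],
}

greedy_path = beam_search_paths(toy_graph, "q", max_hops=2, beam_width=1)[0][1]
beam_path = beam_search_paths(toy_graph, "q", max_hops=2, beam_width=2)[0][1]
print(greedy_path)  # ['q', 'a', 'c']
print(beam_path)    # ['q', 'b', 'd']
```

With `beam_width=1` the search degenerates to greedy selection and commits to the highest-scoring first hop, whereas a width of 2 keeps the second-best hop alive long enough to find the path with the higher total score.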

Paper Type: Long

Research Area: Generation

Research Area Keywords: Generation, Machine Learning for NLP, NLP Applications, Question Answering

Contribution Types: NLP engineering experiment, Approaches to low-resource settings

Languages Studied: English

Submission Number: 3876