TH-RAG: Topic-Based Hierarchical Knowledge Graphs for Robust Multi-hop Reasoning in GraphRAG Systems
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by letting them incorporate external knowledge at inference time. Graph-based RAG methods have shown promise in multi-hop reasoning by leveraging structured representations such as triplets, but they often struggle with semantic sparsity, noisy or inconsistent triplet extraction, and a lack of higher-level abstraction, which together hinder coherent and efficient reasoning. We propose \textbf{TH-RAG}, a novel graph-based RAG framework that constructs \textbf{a three-level hierarchical Knowledge Graph (KG)} composed of entities, subtopics, and topics. TH-RAG maintains high connectivity by semantically organizing triplets through \textbf{Triplet Extraction with Topic}. With \textbf{Topic-based Hierarchical Graph Traversal}, TH-RAG finds related entities through topics and subtopics. Finally, \textbf{Query-Based Filtering} selects only the most relevant triplets and sentence chunks. Experimental results on both open-domain and multi-hop QA benchmarks demonstrate that TH-RAG consistently outperforms strong existing baselines in accuracy and robustness. To support further research, we release our code at: https://anonymous.4open.science/r/KGRAG-2C8D
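The three components named in the abstract can be illustrated with a minimal sketch. It is not the released implementation: the class and function names (`Triplet`, `HierarchicalKG`, `related_entities`, `filter_triplets`), the toy data, and the word-overlap score are all assumptions made for illustration, and the abstract does not say which relevance measure the filtering step actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """A (head, relation, tail) fact tagged with a topic and a subtopic."""
    head: str
    relation: str
    tail: str
    topic: str
    subtopic: str


class HierarchicalKG:
    """Toy three-level graph: topic -> subtopic -> entity (hypothetical layout)."""

    def __init__(self):
        self.subtopics_of_topic = defaultdict(set)
        self.entities_of_subtopic = defaultdict(set)
        self.subtopics_of_entity = defaultdict(set)
        self.triplets = []

    def add(self, t):
        # Subtopics are keyed by (topic, subtopic) so equal names under
        # different topics stay distinct.
        key = (t.topic, t.subtopic)
        self.triplets.append(t)
        self.subtopics_of_topic[t.topic].add(key)
        for entity in (t.head, t.tail):
            self.entities_of_subtopic[key].add(entity)
            self.subtopics_of_entity[entity].add(key)

    def related_entities(self, seed):
        """Climb from the seed to its topics, then descend to all their entities."""
        related = set()
        for topic, _ in self.subtopics_of_entity.get(seed, ()):
            for key in self.subtopics_of_topic[topic]:
                related |= self.entities_of_subtopic[key]
        related.discard(seed)
        return related


def filter_triplets(kg, query, entities, k=2):
    """Keep the k candidate triplets sharing the most words with the query.

    Word overlap is a stand-in score chosen for this sketch only.
    """
    query_words = set(query.lower().split())
    candidates = [t for t in kg.triplets if t.head in entities or t.tail in entities]

    def overlap(t):
        words = set(f"{t.head} {t.relation} {t.tail}".lower().split())
        return len(words & query_words)

    return sorted(candidates, key=overlap, reverse=True)[:k]


kg = HierarchicalKG()
kg.add(Triplet("Marie Curie", "won", "Nobel Prize in Physics", "Science", "Physics"))
kg.add(Triplet("Pierre Curie", "won", "Nobel Prize in Physics", "Science", "Physics"))
kg.add(Triplet("Marie Curie", "discovered", "polonium", "Science", "Chemistry"))
kg.add(Triplet("Paris", "capital of", "France", "Geography", "Cities"))

related = kg.related_entities("Marie Curie")
top = filter_triplets(kg, "who discovered polonium", related | {"Marie Curie"}, k=1)
print(sorted(related))
print(top[0])
```

In this toy graph, the traversal reaches "Pierre Curie" through the shared topic even though no triplet links him to "Marie Curie" directly, while "Paris" stays out of scope because it sits under a different topic.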
Paper Type: Long
Research Area: Generation
Research Area Keywords: domain adaptation, retrieval-augmented generation, inference methods
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 517