SAKGC: An Iterative RAG-Powered Framework for Stereo Knowledge Graph Construction

ACL ARR 2025 May Submission948 Authors

16 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Research on LLM-based knowledge-graph (KG) automation is accelerating, while retrieval-augmented generation (RAG) is becoming the de facto strategy for grounding large language models in external facts. Together these trends highlight a pressing demand for KGs that are not only domain-specialised but also endowed with multi-layer explanations that an LLM can traverse when reasoning. We introduce SAKGC, an iterative two-phase framework that (i) extracts and organises large volumes of heterogeneous data into a compact horizontal KG and (ii) uses RAG to attach complexity-aware, hierarchical explanations to every non-trivial entity. Extensive experiments on three corpora of increasing scale show that SAKGC improves triple accuracy, reduces redundancy and internal-knowledge leakage, and boosts answer correctness and chain-of-thought clarity in downstream QA. Code and data are available at https://anonymous.4open.science/r/SAKGC-3E67.
Paper Type: Long
Research Area: Information Extraction
Research Area Keywords: knowledge base construction, document-level extraction, open information extraction, zero/few-shot extraction, relation extraction
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 948