CC-RAG: Structured Multi-Hop Reasoning via Theme-Based Causal Graphs

ACL ARR 2025 May Submission5497 Authors

20 May 2025 (modified: 03 Jul 2025) · CC BY 4.0
Abstract: Understanding cause-and-effect relationships remains a formidable challenge for Large Language Models (LLMs), particularly in specialized domains where reasoning requires more than surface-level correlations. Retrieval-Augmented Generation (RAG) improves factual accuracy, but standard RAG pipelines treat evidence as flat context, lacking the structure required to model true causal dependencies. We introduce \textbf{Causal-Chain RAG (CC-RAG)}, a novel framework that integrates zero-shot triple extraction and theme-aware graph chaining into the RAG pipeline, enabling structured multi-hop inference. Given a domain-specific corpus, CC-RAG constructs a Directed Acyclic Graph (DAG) of $\langle \mathit{cause}, \mathit{relation}, \mathit{effect} \rangle$ triples and applies forward/backward chaining to guide structured answer generation. Experiments across two real-world domains, Bitcoin price fluctuations and Gaucher disease, demonstrate that CC-RAG outperforms standard RAG and zero-shot LLMs in chain similarity, information density, and lexical diversity. Both LLM-as-a-Judge and human evaluations consistently favor CC-RAG. Our results show that explicitly modeling causal structure allows LLMs to generate more accurate and interpretable responses, particularly in specialized domains where flat retrieval fails.
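The chaining step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the triples, relation labels, and the `forward_chain` helper are hypothetical, showing only how forward chaining over $\langle \mathit{cause}, \mathit{relation}, \mathit{effect} \rangle$ triples in a DAG could surface multi-hop causal paths for answer generation.

```python
from collections import defaultdict

# Hypothetical example triples (not from the paper's corpus),
# loosely themed on the Bitcoin price-fluctuation domain.
triples = [
    ("regulatory crackdown", "reduces", "investor confidence"),
    ("investor confidence", "drives", "trading volume"),
    ("trading volume", "influences", "Bitcoin price"),
]

def forward_chain(triples, start, max_hops=3):
    """Enumerate causal paths from `start` by following effect edges in the DAG."""
    graph = defaultdict(list)
    for cause, rel, effect in triples:
        graph[cause].append((rel, effect))

    paths = []

    def dfs(node, path):
        # Stop at the hop limit or at a sink node (no outgoing edges).
        if len(path) // 2 >= max_hops or not graph[node]:
            if len(path) > 1:
                paths.append(path)
            return
        for rel, effect in graph[node]:
            dfs(effect, path + [rel, effect])

    dfs(start, [start])
    return paths

for p in forward_chain(triples, "regulatory crackdown"):
    print(" -> ".join(p))
```

Backward chaining would mirror this by indexing triples on the effect node and walking toward candidate causes; the resulting paths serve as structured context for the generator.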
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: retrieval-augmented generation, multihop QA, reasoning, interpretability, causality, knowledge tracing/discovering/inducing
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 5497