PathCoRAG: Multi-step Reasoning with Path-Aware CoT Expansion for RAG

ACL ARR 2025 July Submission1074 Authors

29 Jul 2025 (modified: 29 Aug 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) have significantly advanced natural language understanding, yet they often struggle with complex, multi-step reasoning due to limitations in fixed-context knowledge access. Retrieval-Augmented Generation (RAG) frameworks address this by incorporating external knowledge, but conventional methods typically retrieve flat, chunk-level text without respecting the logical structure of reasoning, leading to fragmented and noisy contexts. We introduce PathCoRAG, a novel RAG framework that explicitly aligns multi-step reasoning with path-aware retrieval and context construction. Unlike prior methods, PathCoRAG performs step-wise query decomposition and retrieves nodes and paths corresponding to each reasoning step. This produces a logic-preserving, sequential context structure that guides the LLM through a structured chain of thought during generation. Our approach consists of four tightly integrated components: (1) Chain-of-Thought-based Query Expansion, (2) Hierarchical Node Extraction per reasoning step, (3) Semantic Path Exploration and Scoring, and (4) Structured Context Generation aligned with logical reasoning paths. Experimental results across diverse domains demonstrate that PathCoRAG consistently outperforms strong baselines. \url{https://anonymous.4open.science/r/PathCoRAG-A1BB}
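The four-stage pipeline named in the abstract can be sketched end to end. This is a minimal toy illustration, not the authors' implementation: all function names, the token-overlap node extractor, the length-based path score, and the tiny dictionary graph are assumptions; a real system would use an LLM for step decomposition and learned semantic scoring.

```python
def expand_query_cot(query):
    """Stage 1: CoT query expansion. A real system would prompt an LLM to
    decompose the query into ordered reasoning steps; this toy splits on 'and'."""
    return [s.strip() for s in query.split(" and ") if s.strip()]

def extract_nodes(step, graph):
    """Stage 2: node extraction per reasoning step. Toy heuristic: keep graph
    nodes whose underscore-separated label shares a token with the step."""
    tokens = set(step.lower().split())
    return [n for n in graph if set(n.lower().split("_")) & tokens]

def explore_paths(seeds, graph, max_hops=2, top_k=3):
    """Stage 3: path exploration and scoring. Enumerate short paths from each
    seed node; here the 'semantic' score is simply path length (assumption)."""
    scored = []
    stack = [[s] for s in seeds]
    while stack:
        path = stack.pop()
        scored.append((len(path), path))
        if len(path) <= max_hops:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:          # avoid cycles
                    stack.append(path + [nxt])
    scored.sort(key=lambda sp: -sp[0])
    return [p for _, p in scored[:top_k]]

def build_context(steps, step_paths):
    """Stage 4: structured context generation. Serialize retrieved paths in
    the order of the reasoning steps so the LLM sees a logic-preserving chain."""
    lines = []
    for i, (step, paths) in enumerate(zip(steps, step_paths), 1):
        lines.append(f"Step {i}: {step}")
        for p in paths:
            lines.append("  path: " + " -> ".join(p))
    return "\n".join(lines)

def pathcorag_context(query, graph):
    steps = expand_query_cot(query)
    step_paths = [explore_paths(extract_nodes(s, graph), graph) for s in steps]
    return build_context(steps, step_paths)

# Toy knowledge graph (adjacency lists) for demonstration only.
graph = {"capital_city": ["france"], "france": ["paris"], "paris": []}
ctx = pathcorag_context("find the capital and name the country", graph)
print(ctx)
```

The key design point the sketch mirrors is that retrieval happens per reasoning step and the context preserves step order, rather than concatenating flat top-k chunks for the whole query.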
Paper Type: Long
Research Area: Generation
Research Area Keywords: domain adaptation, retrieval-augmented generation, inference methods
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1074