Keywords: Retrieval-Augmented Generation, Discourse-Aware Generation, Rhetorical Structure Theory, Large Language Models
Abstract: Retrieval-Augmented Generation (RAG) has emerged as an important means of enhancing the performance of large language models (LLMs) on knowledge-intensive tasks. However, most existing RAG strategies treat retrieved passages as flat, unstructured text, which prevents the model from capturing structural cues and constrains its ability to synthesize dispersed evidence and reason across documents. Although a few recent approaches attempt to incorporate structural signals, they remain restricted to shallow representations such as entity graphs or dependency edges and thus fail to capture hierarchical discourse organization. To overcome these limitations, we propose Discourse-RAG, a structure-aware framework that explicitly injects discourse signals into the generation process. Our method constructs intra-chunk Rhetorical Structure Theory (RST) trees to capture local coherence hierarchies and builds inter-chunk rhetorical graphs to model cross-passage discourse flow. These structures are jointly integrated into a planning blueprint that conditions generation. Experiments on question answering and long-document summarization benchmarks demonstrate the efficacy of our approach. Discourse-RAG achieves a new state-of-the-art ROUGE-L score of 42.4 on the ASQA dataset and improves LLM Score by 12.79 points over standard RAG on the Loong benchmark. These findings underscore the important role of discourse structure in advancing retrieval-augmented generation.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 3252