Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs
Keywords: Retrieval-Augmented Generation, Long-context Summarization, Graph Neural Networks, Large Language Models
TL;DR: We propose a method named graph of records (GoR), which utilizes LLM-generated historical responses in retrieval-augmented generation to enhance long-context summarization
Abstract: Retrieval-augmented generation (RAG) has revitalized Large Language Models (LLMs) by injecting non-parametric factual knowledge.
Compared with long-context LLMs, RAG is considered an effective summarization tool in a more concise and lightweight manner, which can interact with LLMs multiple times using diverse queries to get comprehensive responses. However, the LLM-generated historical responses, which contain potentially insightful information, are largely neglected and discarded by existing approaches, leading to suboptimal results. In this paper, we propose \textit{graph of records} (\textbf{GoR}), which leverages historical responses generated by LLMs to enhance RAG for long-context global summarization. Inspired by the \textit{retrieve-then-generate} paradigm of RAG, we construct a graph by creating an edge between the retrieved text chunks and the corresponding LLM-generated response. To further uncover the sophisticated correlations between them, GoR further features a \textit{graph neural network} and an elaborately designed \textit{BERTScore}-based objective for self-supervised model training, enabling seamless supervision signal backpropagation between reference summaries and node embeddings. We comprehensively compare GoR with 12 baselines on four long-context summarization datasets, and the results indicate that our proposed method reaches the best performance. Extensive experiments further demonstrate the effectiveness of GoR.
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10834
Loading