Memory Never Fades: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation

Published: 29 Jan 2025, Last Modified: 29 Jan 2025 · WWW 2025 Poster · CC BY 4.0
Track: Search and retrieval-augmented AI
Keywords: Retrieval-Augmented Generation, Long Context Processing
TL;DR: Enhancing long-context processing with global memory-enhanced retrieval, generating draft answers to guide comprehensive and accurate evidence retrieval.
Abstract: Processing long contexts presents a significant challenge for large language models (LLMs). While recent advancements allow LLMs to handle much longer contexts than before (e.g., 32K or 128K tokens), doing so is computationally expensive and can still be insufficient for many applications. Retrieval-Augmented Generation (RAG) is considered a promising strategy to address this problem. However, conventional RAG methods face inherent limitations because of two underlying requirements: 1) explicitly stated queries, and 2) well-structured knowledge. Neither condition holds in general long-context processing tasks. In this work, we propose HawkRAG, a novel RAG framework empowered by global memory-augmented retrieval. HawkRAG features a dual-system architecture; the name is inspired by the way a hawk glides high in the sky to observe the land, allowing it to spot and target prey with precision from a broad vantage point. First, it employs a light but long-range system to create a global memory of the long context. Once a task is presented, this system generates draft answers, providing useful clues for the retrieval tools to locate relevant information within the long context. Second, it leverages an expensive but expressive system, which generates the final answer based on the retrieved information. Building upon this fundamental framework, we realize the memory module in the form of KV compression, and we reinforce its memorization and cluing capacity with feedback from generation quality (a.k.a. RLGF). In our experiments, HawkRAG achieves superior performance across a variety of long-context evaluation tasks, not only in complex scenarios where traditional RAG methods struggle, but also in simpler ones where RAG is typically applied.
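
The abstract describes a two-stage pipeline: a light system compresses the long context into a global memory and drafts an answer, the draft is used as a retrieval clue, and an expressive system produces the final answer from the retrieved evidence. The sketch below illustrates that control flow only; every function name here (build_global_memory, draft_answer, retrieve, generate_final) is a hypothetical placeholder for the paper's components, not the authors' actual implementation, and the chunking and lexical scoring stand in for the paper's KV compression and retrieval tools.

```python
# Hypothetical sketch of the dual-system flow described in the abstract.
from typing import List

CHUNK = 512  # assumed chunk size for this toy illustration


def build_global_memory(long_context: str) -> List[str]:
    """System 1 (light, long-range): condense the full context into a
    compact global memory. Naive chunking here; the paper realizes this
    module as KV compression."""
    return [long_context[i:i + CHUNK] for i in range(0, len(long_context), CHUNK)]


def draft_answer(memory: List[str], task: str) -> str:
    """Generate a draft answer from the global memory. The draft is not
    the final output; it supplies clues for retrieval."""
    return f"draft clues for: {task}"  # placeholder for a light LLM call


def retrieve(long_context: str, clues: str, k: int = 3) -> List[str]:
    """Locate relevant passages using the draft as the query, rather than
    relying on an explicitly stated query. Placeholder lexical overlap;
    a real system would use the paper's retrieval tools."""
    words = set(clues.lower().split())
    chunks = [long_context[i:i + CHUNK] for i in range(0, len(long_context), CHUNK)]
    scored = sorted(chunks, key=lambda c: -len(words & set(c.lower().split())))
    return scored[:k]


def generate_final(evidence: List[str], task: str) -> str:
    """System 2 (expensive, expressive): answer from retrieved evidence."""
    return f"final answer to '{task}' grounded in {len(evidence)} passages"


def hawkrag(long_context: str, task: str) -> str:
    memory = build_global_memory(long_context)   # global memory of the context
    clues = draft_answer(memory, task)           # draft answer as retrieval clue
    evidence = retrieve(long_context, clues)     # clue-guided evidence retrieval
    return generate_final(evidence, task)        # final, evidence-grounded answer
```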
Submission Number: 121