Abstract: Retrieval-Augmented Generation (RAG) has significantly mitigated hallucination issues in Large Language Models (LLMs), with context compression playing a pivotal role in improving the efficiency of RAG systems. Traditional context compression approaches are either extractive or abstractive. Extractive methods often perform poorly because they model sentences independently, while abstractive methods suffer from high latency and the risk of introducing hallucinations. In this paper, we propose GCR, a novel generative compression method that reformulates context compression as sentence index generation, ensuring minimal inference latency. GCR effectively models semantic interactions between sentences, prevents potential hallucinations during compression, and offers adaptive control over the compression rate. Extensive experiments across three knowledge-intensive tasks confirm the effectiveness and efficiency of our method.
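To make the core idea concrete, the sketch below illustrates (in generic Python, not the authors' implementation) how framing compression as sentence index generation rules out hallucination by construction: the compressor only emits indices, so the compressed context can contain nothing but original sentences. The `generate_indices` callable is a hypothetical stand-in for the generative model.

```python
# Illustrative sketch of compression via sentence index generation.
# Not the paper's implementation; `generate_indices` is a placeholder
# for a model that decodes indices conditioned on the query and context.

from typing import Callable, List


def compress_by_indices(
    sentences: List[str],
    generate_indices: Callable[[List[str], str], List[int]],
    query: str,
) -> str:
    """Keep only sentences whose indices the generator emits."""
    raw = generate_indices(sentences, query)
    # Drop out-of-range indices and duplicates, preserving generation order.
    seen, kept = set(), []
    for i in raw:
        if 0 <= i < len(sentences) and i not in seen:
            seen.add(i)
            kept.append(i)
    # The output is a subset of the original sentences, so no new
    # (potentially hallucinated) text can be introduced.
    return " ".join(sentences[i] for i in kept)


if __name__ == "__main__":
    ctx = [
        "The Eiffel Tower is in Paris.",
        "It was completed in 1889.",
        "Paris is the capital of France.",
    ]
    # Dummy generator standing in for an LLM that would emit e.g. "0 2".
    dummy = lambda sents, q: [0, 2]
    print(compress_by_indices(ctx, dummy, "Where is the Eiffel Tower?"))
```

Adjusting how many indices the generator is allowed to emit gives direct, adaptive control over the compression rate.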
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Context Compression, Augmented Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 370