Abstract: Retrieval-Augmented Generation (RAG) has significantly mitigated hallucination issues in Large Language Models (LLMs), with context compression playing a pivotal role in enhancing the efficiency of RAG systems. Traditional context compression approaches are either extractive or abstractive. Extractive methods often perform poorly because they model sentences independently, while abstractive methods suffer from high latency and the risk of introducing hallucinations. In this paper, we propose VIG, a novel context compression method that reformulates the task as verifiable index generation, ensuring minimal inference latency. VIG effectively models semantic interactions between sentences, prevents potential hallucinations during compression, and offers adaptive control over the compression rate. Extensive experiments across three knowledge-intensive tasks confirm the effectiveness and efficiency of our method.
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: Context Compression, Retrieval Augmented Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4025