Abstract: Long-context inputs in large language models (LLMs) often suffer from the "lost in the middle" problem, where critical information becomes diluted or ignored due to excessive length. Context compression offers a promising solution; however, current compression methods still have notable limitations: hard prompt methods often suffer from low compression ratios, while soft prompt methods tend to lose critical task-relevant information and lack adaptability.
We propose ATACompressor, an adaptive, task-aware context compressor that combines the strengths of both paradigms. ATACompressor (1) efficiently compresses context into compact soft prompts, (2) selectively preserves task-relevant information through a trained encoder, and (3) dynamically adjusts compression rates via an adaptive controller. Experiments on three QA benchmarks demonstrate that ATACompressor achieves state-of-the-art performance while maintaining high efficiency.
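The sketch below illustrates the general idea described above: a query-conditioned encoder that compresses the context into a small number of soft-prompt embeddings, with a lightweight controller that picks the compression rate per input. It is a minimal, hypothetical illustration; all module names, layer sizes, and the rate-selection scheme are assumptions and not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AdaptiveCompressorSketch(nn.Module):
    """Illustrative sketch of adaptive, task-aware context compression.

    Names such as `rate_controller`, `soft_slots`, and `num_rate_bins`
    are hypothetical and chosen only for this example.
    """

    def __init__(self, hidden_dim=768, num_rate_bins=4, max_soft_tokens=64):
        super().__init__()
        # Task-aware encoder: attends over the context conditioned on the query.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        # Learnable slots whose encoded states become the compressed soft prompt.
        self.soft_slots = nn.Parameter(torch.randn(max_soft_tokens, hidden_dim))
        # Adaptive controller: predicts a discrete compression level from the query.
        self.rate_controller = nn.Linear(hidden_dim, num_rate_bins)
        self.max_soft_tokens = max_soft_tokens

    def forward(self, context_emb, query_emb):
        # context_emb: (B, Lc, H) embeddings of the long context
        # query_emb:   (B, Lq, H) embeddings of the task / question
        pooled_query = query_emb.mean(dim=1)               # (B, H)
        rate_logits = self.rate_controller(pooled_query)   # (B, num_rate_bins)
        rate_bin = rate_logits.argmax(dim=-1)               # chosen compression level

        # Coarser rate bin -> fewer soft-prompt tokens (batch-level choice for simplicity).
        n_tokens = self.max_soft_tokens // (2 ** rate_bin.max().item())
        slots = self.soft_slots[:n_tokens].unsqueeze(0).expand(context_emb.size(0), -1, -1)

        # Encode slots jointly with query and context; keep only the slot states.
        encoded = self.encoder(torch.cat([slots, query_emb, context_emb], dim=1))
        soft_prompt = encoded[:, :n_tokens]                  # (B, n_tokens, H)
        return soft_prompt, rate_bin
```

In such a setup the returned `soft_prompt` would be prepended to the downstream LLM's input in place of the full context, so the number of preserved tokens varies with the predicted compression level rather than being fixed in advance.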
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: Efficient/Low-Resource Methods for NLP, Question Answering
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Submission Number: 3922