Efficient Encoder-Only Context Compression via Marginal Contribution Scoring

Published: 01 Jun 2026, Last Modified: 09 Jun 2026
Venue: AdaptFM Poster
Readers: Everyone
License: CC BY 4.0
Keywords: context compression, efficiency
Abstract: Efficient context compression is critical for retrieval-augmented question answering in resource-constrained settings, where long retrieved contexts increase latency, memory use, and LLM reader cost. We propose a lightweight encoder-only framework for query-driven sentence pruning that preserves answer-critical evidence while aggressively reducing irrelevant context. Our method learns marginal contribution scores for sentences using counterfactual training signals, and optimizes a contrastive ranking objective that separates critical evidence from non-critical context. Unlike decoder-heavy compressors, our approach scores all sentences from a single full-context encoding, enabling fast inference with low computational overhead. Experiments show that our method maintains accuracy comparable to the strongest baseline while using 3.7$\times$ less peak memory and substantially lower compression latency, making it suitable for practical resource-constrained deployment.
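The pipeline the abstract describes (counterfactual marginal contribution signals, a ranking objective that separates critical from non-critical sentences, and score-based pruning) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy keyword-matching reader, and the pairwise hinge loss, which stands in for the contrastive ranking objective whose exact form the abstract does not give. In the actual method the per-sentence scores would come from a single encoder pass over the full context; here they are plain numbers supplied by hand.

```python
from typing import Callable, List, Sequence


def marginal_contributions(
    sentences: List[str],
    reader_score: Callable[[List[str]], float],
) -> List[float]:
    """Counterfactual signal (assumed form): drop in reader score when sentence i is removed."""
    full = reader_score(sentences)
    return [
        full - reader_score(sentences[:i] + sentences[i + 1:])
        for i in range(len(sentences))
    ]


def pairwise_margin_loss(
    scores: Sequence[float],
    critical: Sequence[bool],
    margin: float = 1.0,
) -> float:
    """Hinge ranking loss (a stand-in objective): each critical sentence
    should outscore each non-critical sentence by at least `margin`."""
    pos = [s for s, c in zip(scores, critical) if c]
    neg = [s for s, c in zip(scores, critical) if not c]
    if not pos or not neg:
        return 0.0
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def prune(sentences: List[str], scores: Sequence[float], keep: int) -> List[str]:
    """Keep the `keep` highest-scoring sentences, preserving original order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [sentences[i] for i in kept]


# Toy example: a hypothetical reader that succeeds only if the evidence is present.
context = [
    "The sky is blue.",
    "Paris is the capital of France.",
    "Cats sleep a lot.",
]


def toy_reader(ctx: List[str]) -> float:
    return 1.0 if any("Paris" in s for s in ctx) else 0.0


contrib = marginal_contributions(context, toy_reader)
labels = [c > 0 for c in contrib]          # critical vs non-critical
encoder_scores = [0.5, 1.0, 0.0]           # placeholder for encoder outputs
loss = pairwise_margin_loss(encoder_scores, labels)
compressed = prune(context, encoder_scores, keep=1)
```

In this toy setup only the second sentence changes the reader's outcome when removed, so it is the only one labeled critical, and pruning to one sentence keeps it. Note that the counterfactual labels need one reader call per sentence and belong to training only; at inference the scorer alone decides what to keep.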
Submission Number: 79