KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Published: 04 Jun 2026, Last Modified: 04 Jun 2026
Venue: ICML MemFM 2026 Workshop Oral
License: CC BY 4.0
Keywords: large language models, KV cache, retrieval-augmented generation, prompt injection, long-context reasoning
Abstract: Erasing context from a KV cache is challenging because a local edit has global consequences: once a span has been processed, its influence propagates into the cached states of every subsequent token, so exact erasing must recompute all tokens after the deleted span. We introduce KVEraser, a learned KV-cache editing method for efficient localized context erasing. Experiments show that KVEraser matches the performance of full recomputation on in-domain tasks across 1K–32K context lengths, while its latency increases by only 24%, compared with a 17.6× increase for full recomputation. KVEraser also generalizes to unseen QA tasks with harmful factual distractors, achieving a 3–4× speedup over full recomputation.
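The claim that a local deletion has global consequences can be illustrated with a toy example. The sketch below is not KVEraser and does not reproduce the paper's setup: it assumes a single causal self-attention layer with identity projections and no positional encoding, and treats each token's output as a stand-in for the state a deeper layer would cache. The span indices and helper names are hypothetical. It compares states computed with a span present against states recomputed without it.

```python
import math
import random


def causal_attention(xs):
    """One causal self-attention layer with identity Q/K/V projections.

    The output for token i mixes tokens 0..i, so it stands in for the input
    that a deeper layer would project into its cached keys and values.
    """
    outputs = []
    for i, query in enumerate(xs):
        # Attention scores of token i against itself and all earlier tokens.
        scores = [sum(q * k for q, k in zip(query, xs[j])) for j in range(i + 1)]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * xs[j][d] for j, w in enumerate(weights)) / total
            for d in range(len(query))
        ])
    return outputs


def max_abs_diff(a, b):
    """Largest elementwise gap between two equally shaped lists of vectors."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
tokens = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
start, end = 2, 5  # hypothetical span to erase: tokens 2, 3 and 4

# States computed while the span was still in the context.
cached = causal_attention(tokens)
# Exact erasing: recompute from scratch on the context without the span.
exact = causal_attention(tokens[:start] + tokens[end:])

# Tokens before the span never attended to it, so their states can be reused.
prefix_drift = max_abs_diff(cached[:start], exact[:start])
# Tokens after the span did attend to it, so their cached states are stale.
suffix_drift = max_abs_diff(cached[end:], exact[start:])

print("prefix drift:", prefix_drift)
print("suffix drift:", suffix_drift)
```

In this toy setting the prefix states are reusable as-is, while every state after the span differs from its recomputed value. That stale suffix is the part exact erasing must recompute, and the part the abstract says KVEraser edits in place instead.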
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 44