TableKV: KV Cache Compression for In-Context Table Processing

Published: 05 Jun 2025 · Last Modified: 05 Jun 2025 · TRL@ACL2025 · CC BY 4.0
Keywords: tabular data, compression, KV cache, LLM, table, QA
TL;DR: Attention-guided KV cache compression can outperform RAG and match full-context performance, enabling LLMs to process large tables directly in context during inference.
Abstract: Processing large tables provided in-context to LLMs is challenging due to token limits and information overload. While Retrieval-Augmented Generation can select relevant subsets externally, this work explores Key-Value (KV) cache compression as an alternative, applied directly to the linearized table during inference. We show that the LLM's internal attention scores over the table context guide the retention of essential KV pairs, effectively compressing the processing context while preserving the crucial relational information needed for complex queries. Experiments on the Spider, WikitableQA, and QTSumm datasets validate the compression approach for in-context table processing, offering a promising path for improved table representation learning in LLMs.
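To make the core idea concrete, below is a minimal sketch of attention-guided KV cache pruning under assumed tensor shapes: attention weights from the query tokens over the linearized table are aggregated into a per-token importance score, and only the top-scoring table positions keep their KV pairs. The function name `compress_kv_cache` and the `keep_ratio` parameter are illustrative, not the paper's exact method or hyperparameters.

```python
import torch

def compress_kv_cache(keys, values, attn_scores, keep_ratio=0.25):
    """Retain only the KV pairs for the most-attended table tokens.

    keys, values: [num_heads, seq_len, head_dim] cached keys/values for the table span.
    attn_scores:  [num_heads, q_len, seq_len] attention weights from the query
                  (e.g. question tokens) over the linearized table context.
    keep_ratio:   fraction of table positions whose KV pairs are kept (assumed knob).
    """
    # Aggregate attention over heads and query positions -> one importance score per table token.
    token_importance = attn_scores.mean(dim=(0, 1))            # [seq_len]
    k = max(1, int(keep_ratio * token_importance.numel()))
    # Keep the top-k positions, sorted back into their original order.
    keep_idx = token_importance.topk(k).indices.sort().values

    return keys[:, keep_idx, :], values[:, keep_idx, :], keep_idx


if __name__ == "__main__":
    num_heads, seq_len, head_dim, q_len = 8, 512, 64, 16
    keys = torch.randn(num_heads, seq_len, head_dim)
    values = torch.randn(num_heads, seq_len, head_dim)
    attn = torch.softmax(torch.randn(num_heads, q_len, seq_len), dim=-1)

    k_c, v_c, kept = compress_kv_cache(keys, values, attn, keep_ratio=0.25)
    print(f"Compressed table KV cache from {seq_len} to {k_c.shape[1]} positions")
```

In practice the compressed keys and values would be spliced back into the model's KV cache in place of the full table span before decoding continues; the exact aggregation of attention scores (which layers, heads, and query positions) is a design choice not specified by the abstract.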
Include In Proceedings: Yes
Submission Number: 19