StructZip: Compressing Large-Scale Structured Prompts to One Token via Learning Natural Language Descriptions
Keywords: Prompt Compression, LLM
Abstract: Tool use has become a central capability in large language model (LLM)-based agents, enabling them to interact with external environments through structured APIs. However, effective tool use typically requires including a large number of tool descriptions, often with complex schemas, in the context for each inference. This static and structured portion of the prompt grows linearly with the number of tools and poses a significant challenge to inference efficiency. Although prior work has explored prompt compression for long contexts, most approaches focus on unstructured text and are not optimized for compressing structured prompts. To bridge this gap, we introduce \textbf{StructZip}, a novel framework that transforms large structured prompts into parametric memory, which can be elicited by a single token. Our approach first "unzips" the structured prompt into a set of semantically equivalent question-answer (QA) pairs. By fine-tuning the LLM on these QA pairs, StructZip encodes the information into the model's parameters, making it accessible through a designated special token at inference time. We evaluate our method on three representative tasks: table-based question answering, tool use, and closed-set text classification. Experimental results demonstrate that StructZip can compress prompts containing millions of tokens into a single token while maintaining performance nearly on par with using the full, uncompressed prompts, offering a practical solution for efficient structured data handling in LLMs.
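To make the pipeline described in the abstract concrete, the sketch below illustrates the "unzipping" step on a hypothetical tool schema: a structured description is expanded into QA pairs keyed to a designated special token, which would then serve as fine-tuning data. The function names, schema fields, and the `<STRUCTZIP>` token string are illustrative assumptions, not the paper's actual interface.

```python
# Minimal sketch (not the authors' code): "unzip" one structured tool schema
# into semantically equivalent QA pairs tied to a designated special token.
# All names and fields here are illustrative assumptions.

import json

SPECIAL_TOKEN = "<STRUCTZIP>"  # hypothetical placeholder for the designated token


def unzip_tool_schema(tool: dict) -> list[dict]:
    """Turn a structured tool description into QA pairs for fine-tuning."""
    name = tool["name"]
    qa_pairs = [
        {
            "question": f"{SPECIAL_TOKEN} What does the tool '{name}' do?",
            "answer": tool["description"],
        },
        {
            "question": f"{SPECIAL_TOKEN} What parameters does '{name}' accept?",
            "answer": json.dumps(tool["parameters"]),
        },
    ]
    # One QA pair per parameter, so each schema field is separately queryable.
    for pname, pspec in tool["parameters"].items():
        qa_pairs.append({
            "question": f"{SPECIAL_TOKEN} In '{name}', what is the parameter '{pname}'?",
            "answer": f"{pspec['type']}: {pspec['description']}",
        })
    return qa_pairs


if __name__ == "__main__":
    example_tool = {
        "name": "get_weather",
        "description": "Return the current weather for a given city.",
        "parameters": {
            "city": {"type": "string", "description": "Name of the city."},
            "unit": {"type": "string", "description": "Temperature unit, 'C' or 'F'."},
        },
    }
    # These QA pairs would then be used to fine-tune the LLM, so the schema
    # becomes parametric memory retrievable via the special token at inference.
    for qa in unzip_tool_schema(example_tool):
        print(qa["question"], "->", qa["answer"])
```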
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 19211