CODEPROMPTZIP: Code-specific Prompt Compression for Retrieval-Augmented Generation in Coding Tasks with LMs

ACL ARR 2026 January Submission 6729 Authors

05 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: Prompt Compression, Code Generation, Language Modeling
Abstract: Retrieval-Augmented Generation (RAG) enhances code generation by incorporating retrieved code examples into prompts, but the resulting long-context inputs impose substantial memory and computational overhead. Existing prompt compression techniques are largely designed for natural language and fail to account for the structural and semantic properties of code, while also lacking fine-grained control over compression ratios. We propose CodePromptZip, a code-aware prompt compression framework for RAG that enables precise length control while preserving critical information. Motivated by type-aware ablation studies, CodePromptZip leverages static analysis to rank code tokens by information gain and applies a dynamic compression strategy to retain the most informative tokens under a given budget. For incomplete or unparsable code snippets, CodePromptZip employs a language-model-based compressor trained on analyzable samples and augmented with a copy mechanism to preserve key tokens. Extensive experiments on three code-related tasks demonstrate that CodePromptZip consistently outperforms entropy-based and distillation-based baselines, achieving improvements of 23.4%, 28.7%, and 8.7%, respectively, while providing accurate control over compression ratios.
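The budgeted, rank-based retention described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `Token` class, the `compress` function, the token kinds, and especially `REMOVAL_ORDER`, which is a placeholder ordering and not the ranking derived from the paper's ablation studies. The sketch only shows the general idea of dropping the lowest-ranked token kinds first until a target ratio is met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    kind: str  # hypothetical kinds, e.g. "symbol", "keyword", "identifier"


# Placeholder priority (an assumption, not the paper's ranking):
# kinds listed earlier are treated as less informative and dropped first.
REMOVAL_ORDER = ["symbol", "keyword", "identifier"]


def compress(tokens, ratio, removal_order=REMOVAL_ORDER):
    """Remove int(ratio * len(tokens)) tokens, lowest-ranked kinds first.

    The surviving tokens keep their original relative order.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    to_remove = int(ratio * len(tokens))
    rank = {kind: i for i, kind in enumerate(removal_order)}
    # Kinds missing from removal_order are ranked last, so they are kept longest.
    candidates = sorted(
        range(len(tokens)),
        key=lambda i: (rank.get(tokens[i].kind, len(rank)), i),
    )
    dropped = set(candidates[:to_remove])
    return [tok for i, tok in enumerate(tokens) if i not in dropped]


snippet = [
    Token("return", "keyword"),
    Token("a", "identifier"),
    Token("+", "symbol"),
    Token("b", "identifier"),
    Token(";", "symbol"),
    Token("}", "symbol"),
]
kept = compress(snippet, ratio=0.5)
print(" ".join(tok.text for tok in kept))
```

Because the number of removed tokens is computed directly from the ratio, the output length is fixed in advance, which is the kind of length control the abstract contrasts with existing compressors.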
Paper Type: Long
Research Area: Semantics: Lexical, Sentence-level Semantics, Textual Inference and Other areas
Research Area Keywords: Retrieval-Augmented Language Models, Code Models, Semantics: Lexical, Sentence-level Semantics, Textual Inference and Other areas
Contribution Types: Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data resources
Languages Studied: Programming Language, Java, English
Submission Number: 6729