Highlights

- We are the first to use the frozen LLM itself to compress over-limit prompts.
- We achieve a balance among training cost, inference efficiency, and response quality.
- Our method is more general and cost-efficient than existing compression methods.