Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: A novel prompt design paradigm for LLMs, resembling the open-ended nature of biological evolution.
Abstract: We propose a novel prompt design paradigm that challenges conventional wisdom in large language model (LLM) prompting. While conventional wisdom prioritizes well-crafted instructions and demonstrations for in-context learning (ICL), we show that pruning random demonstrations into seemingly incoherent "gibberish" can remarkably improve performance across diverse tasks. Notably, the "gibberish" consistently matches or surpasses state-of-the-art automatic prompt optimization techniques, achieving substantial gains regardless of LLM alignment. Nevertheless, discovering an effective pruning strategy is non-trivial: existing attribution methods and prompt compression algorithms fail to deliver robust results, let alone human intuition. To this end, we propose PromptQuine, a self-discovering evolutionary search framework that automatically finds effective pruning strategies using only low-data regimes. Much like the emergent complexity in nature (such as symbiosis and self-organization) that arises in response to resource constraints, our framework evolves and refines unconventional yet highly effective prompts by leveraging only the tokens present within the context. We demonstrate its effectiveness on classification, multiple-choice question answering, generation, and math reasoning tasks across LLMs, while achieving decent runtime efficiency. We hope our findings can guide mechanistic studies of in-context learning and serve as a call to action, paving the way for more open-ended search algorithms for more effective LLM prompting.
Lay Summary: We use computers to create effective prompts that help large language models (LLMs) perform specific tasks more accurately. Traditionally, these prompts are written in natural language, using clear instructions and carefully selected examples, a method known as in-context learning. While this approach is effective, we ask a deeper question: is natural language truly the optimal way to communicate with LLMs? Through extensive experiments, we uncover a surprising new insight into in-context learning with LLMs. Concretely, we find that pruning random demonstrations into seemingly incoherent "gibberish" can enhance task performance, consistently matching state-of-the-art prompt optimization results. These findings hold across all the LLMs we examined, irrespective of their alignment. To assist other researchers in exploring this idea, whether from a practical perspective (developing better prompt optimization algorithms or stabilizing in-context learning) or from a mechanistic perspective (deepening our understanding of why pruning is so effective), we have developed and released our algorithms (e.g., TAPruning and PromptQuine), which show strong final task performance, decent runtime efficiency, and promising scalability.
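To make the idea concrete, the core loop of an evolutionary token-pruning search can be sketched as follows. This is a minimal illustrative sketch, not the released PromptQuine implementation: the binary keep-mask representation, the `score_fn` interface (e.g., dev-set accuracy of the pruned prompt), and all hyperparameters here are assumptions for illustration.

```python
import random

def prune(tokens, mask):
    """Apply a binary keep-mask to a token list (0 = pruned, 1 = kept)."""
    return [t for t, keep in zip(tokens, mask) if keep]

def evolve_pruning(tokens, score_fn, pop_size=8, generations=20,
                   mut_rate=0.1, seed=0):
    """Toy evolutionary search over token-pruning masks.

    score_fn(tokens) -> float is assumed to evaluate a pruned prompt
    (e.g., task accuracy on a small dev set); higher is better.
    """
    rng = random.Random(seed)
    n = len(tokens)
    # Start every individual from the full, unpruned prompt.
    population = [[1] * n for _ in range(pop_size)]
    best_mask, best_score = population[0][:], score_fn(tokens)
    for _ in range(generations):
        # Mutate: flip (prune/restore) each token with probability mut_rate.
        children = [[bit ^ (rng.random() < mut_rate) for bit in mask]
                    for mask in population]
        # Select: keep the top pop_size masks by score.
        scored = sorted(
            ((score_fn(prune(tokens, m)), m) for m in population + children),
            key=lambda x: x[0], reverse=True,
        )
        population = [m for _, m in scored[:pop_size]]
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0][0], scored[0][1][:]
    return prune(tokens, best_mask), best_score

# Hypothetical usage with a toy score that rewards keeping one key token
# while pruning everything else.
toy = "Classify the sentiment : the movie was great".split()
def toy_score(toks):
    return ("great" in toks) - 0.01 * len(toks)
pruned, score = evolve_pruning(toy, toy_score)
```

Note that the search only removes or restores tokens already present in the context; it never inserts new ones, which mirrors the paper's framing of pruning as the sole evolutionary operator.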
Link To Code: https://github.com/jianyu-cs/PromptQuine/
Primary Area: Deep Learning->Large Language Models
Keywords: Prompt Optimization, In-context Learning, AI Alignment, Self-improvement, Open-Endedness
Submission Number: 2432