Pick Your Textual Gradients

17 Sept 2025 (modified: 22 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Prompt Optimization, Large Language Model, Automatic Prompt Optimization, Continual Learning
Abstract: Automated prompt optimization using textual gradients is a promising approach to improving the performance of Large Language Models (LLMs) with the guidance of natural language feedback. However, the iterative application of these gradients is notoriously unstable. We identify two primary sources of this instability: 1) gradient noise from correctly handled examples, and 2) a loss of generalization, where performance on simpler tasks declines due to over-specialization on complex cases. To address this, we propose a novel framework that stabilizes the optimization process through two core mechanisms: $\textbf{Error-Driven Refinement}$ and $\textbf{Regularized Verification}$. First, the error-driven approach ensures a high-quality learning signal by generating textual gradients exclusively from "picked" incorrect model outputs, thereby mitigating the noise introduced by correctly handled examples. Second, the regularized verification step treats each resulting prompt update as a candidate, which is "picked" only if it passes a preservation test on a fixed holdout set of general examples, ensuring that targeted improvements do not compromise broad robustness. Experiments on several complex instruction-following and reasoning benchmarks demonstrate that our framework drastically reduces optimization instability, prevents performance degradation on general test cases, and consistently finds more robust prompts than standard iterative methods. Our work provides a principled approach to harnessing textual gradients with a high-quality learning signal while preventing specialization-induced degradation, thus enabling a more stable and effective methodology for automated prompt optimization.
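The two "picking" mechanisms described in the abstract can be sketched as a single optimization loop. This is a minimal illustration, not the authors' implementation: `run_task`, `critique`, and `revise` are hypothetical callables standing in for the task executor, the LLM critic that produces a textual gradient, and the LLM editor that applies it.

```python
def optimize_prompt(prompt, train_examples, holdout_examples,
                    run_task, critique, revise, steps=10):
    """Sketch of error-driven refinement + regularized verification.

    run_task(prompt, example) -> (output, is_correct)   # hypothetical task runner
    critique(prompt, failures) -> textual gradient      # hypothetical LLM critic
    revise(prompt, gradient)  -> candidate prompt       # hypothetical LLM editor
    """
    def holdout_score(p):
        # Accuracy on the fixed holdout set of general examples.
        return sum(run_task(p, ex)[1] for ex in holdout_examples) / len(holdout_examples)

    baseline = holdout_score(prompt)
    for _ in range(steps):
        # Error-Driven Refinement: pick only incorrect outputs as the learning signal,
        # discarding the noise from correctly handled examples.
        failures = [ex for ex in train_examples if not run_task(prompt, ex)[1]]
        if not failures:
            break
        gradient = critique(prompt, failures)
        candidate = revise(prompt, gradient)
        # Regularized Verification: pick the candidate only if it passes the
        # preservation test, i.e. does not regress on the holdout set.
        score = holdout_score(candidate)
        if score >= baseline:
            prompt, baseline = candidate, score
    return prompt
```

Rejected candidates are simply discarded, so each accepted update is guaranteed (on the holdout set) not to trade general robustness for case-specific gains.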
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9790