Better by Comparison: Retrieval-Augmented Contrastive Reasoning for Automatic Prompt Optimization

Juhyeon Lee, Wonduk Seo, Hyunjin An, Seunghyun Lee, Yi Bu

Published: 2025 · Last Modified: 19 Mar 2026 · JCDL 2025 · CC BY-SA 4.0
Abstract: Automatic prompt optimization has recently emerged as a strategy for improving the quality of prompts used in Large Language Models (LLMs), with the goal of generating more accurate and useful responses. However, most prior work focuses on direct prompt refinement or model fine-tuning, overlooking the potential of leveraging LLMs' inherent reasoning capability to learn from contrasting examples. In this paper, we present Contrastive Reasoning Prompt Optimization (CRPO), a novel framework that formulates prompt optimization as a retrieval-augmented reasoning process. Our approach retrieves the top-k reference prompt-response pairs from the HelpSteer2 dataset, an open-source collection where each response is annotated for helpfulness, correctness, coherence, complexity, and verbosity, and constructs two complementary optimization paradigms: (1) tiered contrastive reasoning, where the LLM compares high-, medium-, and low-quality exemplars (both prompts and responses) to refine its own generation through reflective reasoning, and (2) multi-metric contrastive reasoning, where the LLM analyzes the best exemplars along each evaluation dimension and integrates their strengths into an optimized prompt. By explicitly contrasting high- and low-quality exemplars, CRPO enables the model to deduce why certain prompts succeed while others fail, thereby achieving more robust and interpretable optimization. Experimental results on the HelpSteer2 benchmark demonstrate that CRPO significantly outperforms baselines. Our findings highlight the promise of contrastive, retrieval-augmented reasoning for advancing automatic prompt optimization. Code is available at: https://github.com/GitLeo1/CRPO
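The tiered contrastive reasoning paradigm described in the abstract can be sketched in code. The following is a minimal, illustrative Python sketch, not the authors' implementation: the toy lexical retriever, the single `helpfulness` score, and all function names are assumptions standing in for HelpSteer2's full annotations and an embedding-based retriever.

```python
# Hypothetical sketch of CRPO's tiered contrastive reasoning step.
# Retrieval, scoring, and prompt wording here are illustrative
# assumptions; a real system would use HelpSteer2's five annotation
# dimensions and a learned retriever.
from dataclasses import dataclass


@dataclass
class Exemplar:
    prompt: str
    response: str
    helpfulness: int  # 0-4 scale, as in HelpSteer2


def retrieve_top_k(query: str, pool: list[Exemplar], k: int) -> list[Exemplar]:
    """Toy lexical retriever: rank exemplars by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(
        pool,
        key=lambda ex: len(q & set(ex.prompt.lower().split())),
        reverse=True,
    )[:k]


def tier(exemplars: list[Exemplar]) -> dict[str, list[Exemplar]]:
    """Split retrieved exemplars into high/medium/low quality tiers."""
    ranked = sorted(exemplars, key=lambda e: e.helpfulness, reverse=True)
    third = max(1, len(ranked) // 3)
    return {
        "high": ranked[:third],
        "medium": ranked[third:2 * third],
        "low": ranked[2 * third:],
    }


def build_contrastive_prompt(query: str, tiers: dict[str, list[Exemplar]]) -> str:
    """Assemble a meta-prompt asking the LLM to contrast the tiers."""
    parts = [f"Task prompt to optimize:\n{query}\n"]
    for name in ("high", "medium", "low"):
        parts.append(f"--- {name.upper()}-quality exemplars ---")
        for ex in tiers[name]:
            parts.append(f"Prompt: {ex.prompt}\nResponse: {ex.response}")
    parts.append(
        "Explain why the high-quality exemplars succeed where the "
        "low-quality ones fail, then rewrite the task prompt accordingly."
    )
    return "\n".join(parts)
```

The resulting string would be sent to the LLM, whose reflective comparison of the tiers yields the optimized prompt; the multi-metric variant would instead group the best exemplar per annotation dimension.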