Data-Augmented Prompt Optimization for LLMs

ACL ARR 2025 February Submission 6776 Authors

16 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: With the rapid advancement of Large Language Models (LLMs), their performance has become increasingly dependent on prompt quality, making prompt optimization a critical research area. Existing techniques, however, are often limited by a lack of the diverse and challenging data needed to fully test and refine prompt effectiveness. To address this challenge, we propose a novel approach that integrates data augmentation into the prompt optimization process, exploiting the synergy between synthetic data generation and prompt optimization. Our framework treats prompt optimization as the key driver of LLM performance and incorporates data augmentation to strengthen it. We introduce a multi-agent system consisting of an automatic prompt optimization agent and a synthetic data generation agent, in which the former serves as the core component, iteratively refining prompts through a closed-loop feedback mechanism. This dynamic interplay enables the LLM to engage in self-reflection and multi-step reasoning, structurally improving prompt quality over time. Our results demonstrate that augmenting prompt optimization with synthetic data significantly improves model performance, surpassing existing methods on classification, question answering, and reasoning tasks across multiple benchmarks.
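
The closed-loop interplay between the two agents can be pictured with a minimal sketch. The functions synthetic_data_agent, evaluate_prompt, and prompt_optimizer_agent below are hypothetical stand-ins, not the paper's implementation; in the actual system each role would be carried out by an LLM, and the scoring would come from task accuracy on the generated data.

```python
# Minimal sketch of the closed-loop feedback described in the abstract.
# All agent functions are hypothetical placeholders for LLM-driven components.
import random


def synthetic_data_agent(task: str, n: int) -> list[dict]:
    """Hypothetical stand-in: produce n synthetic (input, label) pairs for the task."""
    return [{"input": f"{task} example {i}", "label": i % 2} for i in range(n)]


def evaluate_prompt(prompt: str, data: list[dict]) -> float:
    """Hypothetical stand-in: score how well the prompt handles the synthetic data."""
    random.seed(len(prompt) + len(data))  # deterministic placeholder score
    return random.random()


def prompt_optimizer_agent(prompt: str, score: float) -> str:
    """Hypothetical stand-in: propose a refined prompt given feedback."""
    return prompt + " Think step by step."  # placeholder refinement


def closed_loop(task: str, prompt: str, rounds: int = 5) -> str:
    """Iteratively refine the prompt against freshly generated synthetic data."""
    best_prompt, best_score = prompt, evaluate_prompt(prompt, [])
    for _ in range(rounds):
        data = synthetic_data_agent(task, n=8)            # data augmentation step
        score = evaluate_prompt(best_prompt, data)        # feedback signal
        candidate = prompt_optimizer_agent(best_prompt, score)
        cand_score = evaluate_prompt(candidate, data)
        if cand_score > best_score:                       # keep the better prompt
            best_prompt, best_score = candidate, cand_score
    return best_prompt


if __name__ == "__main__":
    print(closed_loop("sentiment classification", "Classify the sentiment of the text."))
```
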
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: NLP Applications, Question Answering
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 6776