Abstract: Prompt optimization (PO) provides a practical way to improve response quality when users lack the time or expertise to manually craft effective prompts. Existing methods typically rely on advanced, large-scale LLMs like GPT-4 to generate optimized prompts. However, due to limited downward compatibility, the verbose, instruction-heavy prompts produced by advanced LLMs can overwhelm lightweight inference models and degrade response quality. In this work, we rethink prompt optimization through the lens of explicit and interpretable design. We first identify a set of model-agnostic prompt quality merits and empirically validate their effectiveness in enhancing prompt and response quality. We then introduce a merit-guided prompt optimizer (MePO), which is locally deployable and trained on our preference dataset built from merit-guided prompts generated by a lightweight LLM. Unlike prior work, MePO avoids reliance on online optimization, reduces cost and privacy concerns, and, by learning clear, interpretable merits, generalizes effectively to both large-scale and lightweight inference models. Experiments demonstrate that MePO outperforms existing prompt optimization methods across diverse tasks and model types, offering a scalable and robust solution for real-world deployment.
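The abstract describes a two-stage pipeline: a small, locally deployed optimizer model rewrites the user's raw prompt, and the rewritten prompt is then passed to the inference model. Below is a minimal sketch of that flow; the checkpoint names, rewriting instruction, and `optimize_prompt` helper are illustrative assumptions, not the paper's released artifacts.

```python
# Sketch of a MePO-style optimize-then-infer pipeline (assumptions labeled).
from transformers import pipeline

# Hypothetical stand-in checkpoints; the paper's trained optimizer is not
# reproduced here. Any local instruction-tuned models could be substituted.
optimizer = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")
inference = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

def optimize_prompt(raw_prompt: str) -> str:
    """Rewrite a raw user prompt into a clearer, self-contained prompt."""
    # Illustrative merit-style instruction; the actual training template
    # and learned merits come from the paper's preference dataset.
    instruction = (
        "Rewrite the following prompt so it is clear, concise, and "
        "self-contained, without adding unnecessary instructions.\n"
        f"Prompt: {raw_prompt}\nRewritten prompt:"
    )
    out = optimizer(instruction, max_new_tokens=128, do_sample=False)
    # generated_text includes the input; keep only the rewritten prompt.
    return out[0]["generated_text"].split("Rewritten prompt:")[-1].strip()

raw = "explain transformer attention simple"
better = optimize_prompt(raw)
answer = inference(better, max_new_tokens=256)[0]["generated_text"]
print(answer)
```

Because the optimizer runs locally and emits a plain rewritten prompt, the same sketch works whether the inference model is a lightweight local checkpoint or a large-scale hosted one.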
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: prompting; optimization methods
Contribution Types: NLP engineering experiment, Approaches to low-compute settings (efficiency), Data resources
Languages Studied: English
Submission Number: 705