Keywords: Large Language Models, Instruction Optimization
TL;DR: Facilitates instruction optimization for black-box LLMs by leveraging the preimage structure over soft prompts and instructions.
Abstract: Large language models (LLMs) have achieved remarkable success across diverse domains, due to their strong instruction-following capabilities. This has raised interest in optimizing instructions for black-box LLMs, whose internal parameters are inaccessible but which are popular for their strong performance and ease of use. Recent approaches leverage white-box LLMs to assist instruction optimization for black-box LLMs by generating instructions from soft prompts. However, white-box LLMs often map different soft prompts to the same instruction, leading to redundant queries to the black-box model. While previous studies regarded this many-to-one mapping as a redundancy to be avoided, we reinterpret it as useful prior knowledge that can enhance optimization performance. To this end, we introduce PREimage-informed inSTruction Optimization (PRESTO), a novel framework that leverages the preimage structure of soft prompts to improve query efficiency. PRESTO consists of three key components: (1) score sharing, which shares the evaluation score with all soft prompts in a preimage; (2) preimage-based initialization, which selects initial data points that maximize search-space coverage using preimage information; and (3) score consistency regularization, which enforces prediction consistency within each preimage. By leveraging preimages, PRESTO observes 14 times more scored data under the same query budget, resulting in more efficient optimization. Experimental results on 33 instruction optimization tasks demonstrate the superior performance of PRESTO.
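To make the score-sharing idea concrete, the following is a minimal Python sketch (not from the submission; the callables `generate_instruction` and `evaluate_instruction` are hypothetical stand-ins for the white-box generator and the black-box evaluation) of how soft prompts could be grouped into preimages and how a single black-box evaluation per instruction could be shared across its preimage.

```python
# Minimal sketch of the score-sharing component (hypothetical names, not the authors' code).
# Assumption: a white-box LLM maps each soft prompt to an instruction string, and the
# black-box LLM is queried only once per distinct instruction.

from collections import defaultdict


def build_preimages(soft_prompts, generate_instruction):
    """Group soft prompt indices by the instruction they decode to (the preimage structure)."""
    preimages = defaultdict(list)
    for idx, prompt in enumerate(soft_prompts):
        instruction = generate_instruction(prompt)
        preimages[instruction].append(idx)
    return preimages


def share_scores(preimages, evaluate_instruction):
    """Query the black-box LLM once per instruction and propagate the score
    to every soft prompt in that instruction's preimage."""
    scores = {}
    for instruction, prompt_indices in preimages.items():
        score = evaluate_instruction(instruction)  # single black-box query per instruction
        for idx in prompt_indices:
            scores[idx] = score  # all soft prompts in the preimage receive the same score
    return scores
```

Under this reading, each black-box query yields scored data for every soft prompt in the corresponding preimage rather than for a single point, which is how the reported multiplication of scored data under a fixed query budget would arise.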
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 12346