Keywords: Large Language Models; Instruction Optimization
Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities in solving a wide range of real-world tasks. However, their performance and the quality of their generated content depend heavily on task-relevant instructions, making instruction optimization a challenging but critical direction to explore. In particular, because practitioners generally cannot access the internal parameters and gradients of black-box (API) LLMs, instruction optimization for black-box LLMs is especially non-trivial. Existing methods for optimizing black-box LLM instructions mainly rely on in-context learning with manually designed or heuristic rules, which can be insufficient given the extreme complexity of modern black-box LLMs containing hundreds of billions of parameters.
To address these challenges, we propose a novel automatic instruction optimization framework named Automatic Instruction Optimizer (AIO). AIO is designed to perceive target task information and adaptively adjust its task-aware instructing strategy for a task-solver black-box LLM.
By leveraging a white-box LLM with parameter fine-tuning for enhanced representation power, AIO can automatically update its instructing strategy based on feedback from the task-solver black-box LLM.
To achieve this goal, AIO adopts a novel LLM fine-tuning process powered by zeroth-order gradient approximation and contextual bandit techniques, which addresses the challenge of inaccessible black-box LLM parameters and gradients while alleviating API cost concerns by flexibly reusing collected black-box LLM feedback.
Extensive empirical evaluations demonstrate the properties of AIO and its effectiveness in comparison with strong baselines.
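
To make the abstract's central mechanism concrete, below is a minimal sketch of two-point (SPSA-style) zeroth-order gradient estimation, the general technique the abstract names for fine-tuning a white-box model when the task-solver black-box LLM exposes no gradients. This is not the authors' implementation: the objective, parameter vector, and all names (`black_box_score`, `theta`, `mu`) are illustrative assumptions standing in for the black-box feedback signal and the white-box LLM's tunable parameters.

```python
# A minimal sketch of two-point zeroth-order gradient estimation,
# assuming a scalar black-box feedback signal. Not AIO's actual code.
import numpy as np

def black_box_score(theta: np.ndarray) -> float:
    """Placeholder for the feedback signal: in a setup like AIO's, this
    would be the task-solver black-box LLM's performance under the
    instruction produced by a white-box model with parameters `theta`."""
    return -float(np.sum((theta - 1.0) ** 2))  # toy objective, optimum at theta = 1

def zeroth_order_grad(theta: np.ndarray, mu: float = 1e-2) -> np.ndarray:
    """SPSA-style two-point estimator: perturb all parameters along one
    random direction and difference the two black-box evaluations."""
    z = np.random.randn(*theta.shape)           # shared random direction
    f_plus = black_box_score(theta + mu * z)    # first black-box query
    f_minus = black_box_score(theta - mu * z)   # second black-box query
    return (f_plus - f_minus) / (2.0 * mu) * z  # directional gradient estimate

# Simple ascent loop on the estimated gradient (maximizing the score).
theta = np.zeros(8)
for step in range(500):
    theta += 0.05 * zeroth_order_grad(theta)
print("final score:", black_box_score(theta))
```

Each gradient estimate costs only two black-box evaluations regardless of parameter dimension, which is what makes the approach viable when every query is a paid API call.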
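The abstract also credits contextual bandit techniques with reusing collected feedback to cut API cost. The paper does not specify the algorithm in this excerpt, so as one plausible instantiation, here is a standard LinUCB sketch in which each arm is a candidate instruction, the context is a task feature vector, and cached (context, reward) pairs from earlier API calls can be replayed through `update` instead of issuing new black-box queries. All names are hypothetical.

```python
# A standard LinUCB contextual bandit (an assumption, not the paper's
# algorithm) for selecting among candidate instructions.
import numpy as np

class LinUCB:
    def __init__(self, n_arms: int, dim: int, alpha: float = 1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm covariance
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward stats

    def select(self, x: np.ndarray) -> int:
        """Pick the arm with the highest upper confidence bound."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            # mean reward estimate + exploration bonus
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        """Cached feedback from earlier black-box queries can be replayed
        here, avoiding repeat API calls."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Usage: context features describe the task; reward is black-box feedback.
bandit = LinUCB(n_arms=4, dim=3)
x = np.array([0.2, 0.5, 0.1])        # illustrative task context
arm = bandit.select(x)               # choose a candidate instruction
bandit.update(arm, x, reward=0.8)    # e.g., a cached feedback score
```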
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4891