From Interaction Trajectories to Prompt Rules: Credit Assignment for Multi-Agent Prompt Optimization

Published: 30 Apr 2026, Last Modified: 24 Jun 2026
Venue: ICML 2026 regular
Readers: Everyone
License: CC BY 4.0
Abstract: Large language model (LLM)-based multi-agent systems commonly rely on natural-language prompts to specify agent behavior, yet optimizing these prompts remains challenging when agent roles and interaction structures are fixed by design. In such systems, behaviors emerge over long, noisy interaction trajectories, making it difficult to determine which prompt components are responsible for success or failure. Outcome-level feedback alone is therefore insufficient, yet existing prompt optimization methods typically rely on final task scores or global prompt rewrites, which limits their ability to exploit trajectory evidence or support localized updates. We propose Trajectory-based Rule Credit Estimation (TRUCE), a framework for prompt optimization in multi-agent systems that explicitly addresses this credit assignment challenge. TRUCE performs trajectory-aware attribution by linking outcome feedback to informative sub-trajectories and translating the resulting credit signals into unit-level edits over prompt-defined behavioral rules. By preserving agent roles and interaction structures, TRUCE enables prompt refinement through localized updates aggregated across tasks. Experiments on multiple benchmarks demonstrate that TRUCE consistently improves task performance and efficiency over competitive baselines. Code is available at https://github.com/bingo-w/TRUCE.
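To make the credit-assignment idea concrete, the following is a minimal illustrative sketch and not the TRUCE implementation (see the linked repository for that). It assumes, purely for illustration, that each trajectory has already been split into sub-trajectories, that each sub-trajectory carries a hypothetical informativeness weight and the IDs of the prompt rules it involved, and that outcome feedback is a single scalar. The names Segment, Trajectory, rule_credit, and rules_to_edit are invented here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """A sub-trajectory plus the prompt rules assumed to be involved in it."""
    rule_ids: list[str]
    weight: float  # hypothetical informativeness score (assumption)


@dataclass
class Trajectory:
    outcome: float  # scalar task feedback, e.g. +1 success, -1 failure
    segments: list[Segment]


def rule_credit(trajectories: list[Trajectory]) -> dict[str, float]:
    """Spread each outcome over segments by weight, then average per rule."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for traj in trajectories:
        # Normalize weights within a trajectory; fall back to 1.0 if all zero.
        norm = sum(seg.weight for seg in traj.segments) or 1.0
        for seg in traj.segments:
            share = traj.outcome * seg.weight / norm
            for rule_id in seg.rule_ids:
                totals[rule_id] += share
                counts[rule_id] += 1
    return {rule_id: totals[rule_id] / counts[rule_id] for rule_id in totals}


def rules_to_edit(credit: dict[str, float], threshold: float = 0.0) -> list[str]:
    """Select only rules whose aggregated credit falls below the threshold."""
    return sorted(rule_id for rule_id, c in credit.items() if c < threshold)


# Toy data: one successful and one failed task sharing two rules.
runs = [
    Trajectory(outcome=1.0, segments=[Segment(["r1"], 1.0), Segment(["r2"], 1.0)]),
    Trajectory(outcome=-1.0, segments=[Segment(["r2"], 3.0), Segment(["r1"], 1.0)]),
]
credit = rule_credit(runs)
targets = rules_to_edit(credit)
```

In this toy setup only the rule with negative aggregated credit is flagged for revision, while the other rule and all agent roles stay untouched, which mirrors the localized, cross-task update style the abstract describes.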
Lay Summary: Large language models are increasingly used in systems where multiple AI agents collaborate to solve complex tasks, such as research assistance, coding, or negotiation. In these systems, agent behavior is largely controlled by natural-language prompts, but improving those prompts is difficult. When a multi-agent system succeeds or fails, the final outcome alone does not reveal which specific instructions helped or caused problems, because many agents interact over long and complex conversations. We introduce TRUCE, a method for automatically improving prompts in multi-agent AI systems. Instead of only looking at the final result, TRUCE analyzes the interaction process itself, identifies which parts of the trajectory were most responsible for success or failure, and links those observations back to specific behavioral rules in the prompts. It then makes small, targeted prompt updates rather than rewriting everything at once. Experiments on collaborative and competitive multi-agent benchmarks show that TRUCE consistently improves task performance, coordination quality, and efficiency compared with existing prompt optimization methods. This work helps make multi-agent AI systems easier to improve automatically, reducing reliance on manual expert prompt engineering as these systems become more complex.
Primary Area: Optimization->Everything Else
Keywords: Prompt Optimization, Multi-agent System, Credit Assignment
Submission Number: 25701