Nego-Evol: An Evolution Framework with Behavioral Cloning and Environment Modeling for Goal-oriented LLM Agents
Keywords: Negotiation, Dialogue Generation, LLM Evolution, GRPO
Abstract: Strategic planning plays a pivotal role in guiding effective responses for Large Language Model (LLM)-powered agents in goal-oriented tasks. Existing approaches typically rely on selecting pre-defined strategies and then fine-tuning LLMs on static datasets. However, these methods often produce homogeneous responses and fall short of exploring unseen strategies in diverse scenarios. To address these limitations, we introduce Nego-Evol, a training-based evolution framework that improves the negotiation capabilities of LLMs through behavioral cloning and environment modeling. Specifically, we first equip the policy model with fundamental capabilities and prior knowledge at the behavioral cloning stage, then iteratively leverage MCTS to synthesize high-quality data and perform Grouped Reward Policy Optimization with multi-turn simulation. Extensive experiments on two mainstream benchmarks demonstrate that Nego-Evol progressively enhances its negotiation capabilities during evolution and eventually outperforms existing baselines. Moreover, Nego-Evol exhibits the spontaneous emergence of new strategies, paving the way for adaptation to more diverse negotiation settings.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM agents, agent coordination and negotiation, reinforcement learning in agents, planning in agents
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 6891