Nego-Evol: An Evolution Framework with Behavioral Cloning and Environment Modeling for Goal-oriented LLM Agents
Keywords: Negotiation, Dialogue Generation, LLM Evolution, GRPO
Abstract: Strategic planning plays a pivotal role in guiding effective responses for Large Language Model (LLM)-powered agents in goal-oriented tasks. Existing approaches typically rely on selecting pre-defined strategies and then fine-tuning LLMs on static datasets. However, these methods often produce homogeneous responses and fall short of exploring unseen strategies in diverse scenarios. To address these limitations, we introduce Nego-Evol, a training-based evolution framework that improves the negotiation capabilities of LLMs through behavioral cloning and environment modeling. Specifically, we first equip the policy model with fundamental capabilities and prior knowledge at the behavioral cloning stage, then iteratively leverage MCTS to synthesize high-quality data and perform Grouped Reward Policy Optimization with multi-turn simulation. Extensive experiments on two mainstream benchmarks demonstrate that Nego-Evol progressively enhances its negotiation capabilities during evolution and eventually outperforms existing baselines. Moreover, Nego-Evol exhibits the spontaneous emergence of new strategies, paving the way for adaptation to more diverse negotiation settings.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM agents, agent coordination and negotiation, reinforcement learning in agents, planning in agents
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 6891