Hierarchical Agenda Reasoning for Strategic Multi-Turn Dialogue Agents

Published: 02 Mar 2026, Last Modified: 05 Mar 2026 · LLA 2026 Poster · CC BY 4.0
Keywords: large language models, negotiation, reasoning
TL;DR: enabling better strategic dialogue with hierarchical reasoning abstractions
Abstract: Large language models (LLMs) fine-tuned with reinforcement learning from human feedback (RLHF) are optimized for cooperative instruction following, which makes them poorly suited for strategic decision-making in long-horizon dialogue. To test the ability of LLMs to engage in strategic dialogue, we introduce StrategicBench, a benchmark of 30 negotiation tasks inspired by Harvard Program on Negotiation materials. Our benchmark reveals that while reasoning-oriented LLMs outperform instruction-tuned models, they fail to revise their strategy in response to opponent behavior. To address this limitation, we introduce Hierarchical Agenda Reasoning (HAR), a hierarchical reasoning framework that explicitly separates what an agent seeks to achieve from how it acts in multi-turn dialogue. HAR structures generation around persistent goal representations that guide the selection and revision of tactics across turns, enabling agents to backtrack from failed strategies without abandoning larger objectives. We find that HAR outperforms instruction-tuned and prompting-based approaches, achieving higher agreement rates, better outcomes, and improved conversation quality in human evaluations. Finally, multi-turn RL fine-tuning with HAR agents generalizes negotiation performance to unseen tasks and opponent personalities.
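The abstract does not specify HAR's interface, so the following is only a minimal Python sketch of the control loop it describes: a persistent agenda of goals (the "what") that outlives individual turns, with per-turn tactics (the "how") that are retired and replaced when the opponent's behavior defeats them, without dropping the underlying goal. The names `HARAgent`, `propose_tactic`, `act`, and `tactic_failed` are hypothetical placeholders, the latter three standing in for model or judge calls that the paper would define.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Goal:
    """A persistent objective: what the agent seeks to achieve."""
    description: str
    achieved: bool = False


@dataclass
class Tactic:
    """A turn-level plan: how the agent acts; retired if it fails."""
    description: str


class HARAgent:
    """Hypothetical HAR-style controller. Goals persist across turns;
    tactics are selected per turn and backtracked on failure without
    discarding the goal they serve."""

    def __init__(
        self,
        goals: list[Goal],
        propose_tactic: Callable[[Goal, list[Tactic]], Tactic],  # e.g. an LLM call
        act: Callable[[Goal, Tactic, str], str],                 # e.g. an LLM call
        tactic_failed: Callable[[Tactic, str], bool],            # e.g. an LLM judge
    ):
        self.goals = goals
        self.propose_tactic = propose_tactic
        self.act = act
        self.tactic_failed = tactic_failed
        self.retired: dict[int, list[Tactic]] = {id(g): [] for g in goals}
        self.current: dict[int, Tactic] = {}

    def step(self, opponent_message: str) -> str:
        # Pursue the first unachieved goal on the agenda.
        goal = next((g for g in self.goals if not g.achieved), None)
        if goal is None:
            return "I believe we have covered everything."
        tactic = self.current.get(id(goal))
        # Backtrack: retire a tactic the opponent's behavior has defeated,
        # but keep the goal itself on the agenda.
        if tactic is not None and self.tactic_failed(tactic, opponent_message):
            self.retired[id(goal)].append(tactic)
            tactic = None
        if tactic is None:
            # Propose a fresh tactic, conditioned on what has already failed.
            tactic = self.propose_tactic(goal, self.retired[id(goal)])
            self.current[id(goal)] = tactic
        return self.act(goal, tactic, opponent_message)
```

Under this reading, the separation of agenda state from tactic selection is what lets the agent revise strategy mid-dialogue: only the tactic is discarded on failure, so the larger objective survives the backtrack.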
Submission Number: 43