ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues

ACL ARR 2024 December Submission234 Authors

11 Dec 2024 (modified: 05 Feb 2025), ACL ARR 2024 December Submission, CC BY 4.0

Abstract: Non-cooperative dialogues, such as negotiation and persuasion, present significant challenges for large language models (LLMs) because the participants lack inherent cooperation or shared goals. Current methods for optimizing dialogue strategies require substantial human effort. To address these challenges, we propose ASTRO (Automatic Strategy Optimization), a fully automated solution that leverages LLMs' self-evolving capabilities. ASTRO dynamically generates customized strategy sets based on task goals and optimizes the strategy planner using a self-play reinforcement learning paradigm. Our experimental results demonstrate ASTRO's significant performance improvements over baseline models across various non-cooperative dialogue tasks, highlighting the potential for developing such agents autonomously, without human intervention. Our code and data will be openly released.

Paper Type: Long

Research Area: Dialogue and Interactive Systems

Research Area Keywords: task-oriented

Contribution Types: NLP engineering experiment

Languages Studied: English

Submission Number: 234
