An Effective Negotiating Agent Framework based on Deep Offline Reinforcement Learning

Published: 08 May 2023, Last Modified: 26 Jun 2023 · UAI 2023
Keywords: Automated negotiation, Reinforcement learning, Agent-based and Multi-agent Systems, Agreement Technologies
TL;DR: A novel Deep Offline Reinforcement learning Negotiating Agent (DOREA) framework can learn a strategy from an offline dataset and adapt it to opponent changes.
Abstract: Learning is crucial for automated negotiation, and recent years have witnessed remarkable achievements in the application of reinforcement learning (RL) to various negotiation tasks. Conventional RL methods generally focus on learning from active interactions with opposing negotiators. However, collecting online data is expensive in many realistic negotiation scenarios. While previous studies partially mitigate this problem through the use of opponent simulators (i.e., agents following known strategies), in reality it is usually hard to fully capture an opponent's negotiation strategy. A further challenge lies in an agent's capability to adapt to dynamic variations in an opponent's preferences or strategies, which may occur from time to time for different reasons in subsequent negotiations. In response to these challenges, this article proposes a novel Deep Offline Reinforcement learning Negotiating Agent framework that learns an effective strategy from previously collected negotiation datasets without requiring interaction with an opponent, in contrast to existing RL-based negotiation approaches, which all rely on active interaction with opponents. Furthermore, a strategy fine-tuning mechanism is included to adjust the learned strategy in response to changes in the opponent's preferences or strategy. The performance of the proposed framework is evaluated against a diverse set of state-of-the-art baselines under different settings. Experimental results show that the framework learns effective strategies exclusively from offline datasets and is also capable of effectively adapting to changes in an opponent's preferences or strategy.
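The core idea of the abstract, learning a negotiation strategy purely from logged transitions rather than live interaction, can be sketched in miniature with tabular offline Q-learning. This is an illustrative toy, not the paper's actual DOREA architecture: the state encoding (remaining-round buckets), the action set, and the reward shaping are all assumptions made for the example.

```python
from collections import defaultdict

ACTIONS = ("concede", "hold", "accept")  # illustrative action set, not the paper's

def offline_q_learning(dataset, alpha=0.1, gamma=0.95, epochs=50):
    """Fit a tabular Q-function from a fixed set of logged transitions.

    Each transition is (state, action, reward, next_state, done); no
    interaction with an opponent is needed, only the offline dataset.
    """
    Q = defaultdict(float)
    for _ in range(epochs):
        for s, a, r, s_next, done in dataset:
            target = r if done else r + gamma * max(
                Q[(s_next, a2)] for a2 in ACTIONS)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

# Toy logged negotiation: states are "rounds remaining" buckets, and only
# the final agreement yields a reward (the agent's utility of the deal).
dataset = [
    (2, "hold",    0.0, 1, False),
    (1, "concede", 0.0, 0, False),
    (0, "accept",  0.8, 0, True),
]

Q = offline_q_learning(dataset)
# Greedy policy extracted from the learned Q-values.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in (0, 1, 2)}
print(policy)  # → {0: 'accept', 1: 'concede', 2: 'hold'}
```

Fine-tuning in this toy setting would simply mean running a few more epochs of the same update on freshly logged transitions against the changed opponent, warm-starting from the learned Q-table.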
Supplementary Material: pdf
Other Supplementary Material: zip