Abstract: This work studies how to play against unknown opponents in
bilateral negotiation games, where two parties with different
interests try to reach consensus following the stacked alternating offer
protocol. When faced with different types of opponents
using unknown strategies, it is essential for the
negotiator to learn about its opponents from observations and then
find the best response in order to reach efficient agreements.
A novel approach is proposed based on deep Bayesian policy
reuse+, which includes two key components: a learning
module based on deep reinforcement learning, which learns a new
response policy when encountering an opponent using a
previously unseen strategy, and a policy reuse mechanism, which
efficiently detects the strategy of an opponent and selects the
optimal response policy from the policy library. The performance
of our agent is evaluated against winning agents of ANAC
competitions under varied negotiation scenarios. The
experimental results show that the proposed agent outperforms
existing state-of-the-art agents, and is also able to efficiently
detect and optimally respond to unknown opponents.
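
The policy reuse mechanism described above can be illustrated with a minimal sketch of Bayesian policy reuse. This is not the authors' implementation: it assumes a small tabular performance model in which the utility the agent obtains in a negotiation session is Gaussian with a known mean for each (response policy, opponent type) pair, whereas the proposed approach relies on deep models. The opponent types, policy names, utility values, standard deviation, and detection threshold are all hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used as the assumed performance model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def signal_likelihoods(observed_utility, used_policy, mean_utility, std):
    """Likelihood of the observed session utility under each known opponent type."""
    return {
        tau: gaussian_pdf(observed_utility, mean, std)
        for tau, mean in mean_utility[used_policy].items()
    }

def update_belief(belief, likelihoods):
    """Bayes rule: posterior(tau) is proportional to likelihood(tau) * prior(tau)."""
    unnormalized = {tau: belief[tau] * likelihoods[tau] for tau in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        return dict(belief)  # signal impossible under every known type: keep prior
    return {tau: p / total for tau, p in unnormalized.items()}

def select_policy(belief, mean_utility):
    """Choose the library policy with the highest belief-weighted expected utility."""
    def score(policy):
        return sum(belief[tau] * mean_utility[policy][tau] for tau in belief)
    return max(mean_utility, key=score)

def looks_unknown(likelihoods, threshold):
    """Flag a possibly unseen strategy when no known type explains the signal.
    In that case a new response policy would be learned (assumed rule)."""
    return max(likelihoods.values()) < threshold

# Hypothetical policy library: mean utility of each policy against each type.
MEAN_UTILITY = {
    "pi_vs_conceder": {"conceder": 0.8, "hardliner": 0.4},
    "pi_vs_hardliner": {"conceder": 0.6, "hardliner": 0.7},
}
STD = 0.1          # assumed spread of session utilities
THRESHOLD = 0.1    # assumed detection threshold on the likelihood

prior = {"conceder": 0.5, "hardliner": 0.5}
first_choice = select_policy(prior, MEAN_UTILITY)

# One session played with "pi_vs_conceder" yields utility 0.78.
lik = signal_likelihoods(0.78, "pi_vs_conceder", MEAN_UTILITY, STD)
posterior = update_belief(prior, lik)
next_choice = select_policy(posterior, MEAN_UTILITY)

# A utility of 0.1 is poorly explained by both known types.
odd_lik = signal_likelihoods(0.1, "pi_vs_conceder", MEAN_UTILITY, STD)
```

In this sketch the belief starts uniform, is sharpened by a single observed utility, and the response policy is re-selected from the library. When every known opponent type assigns a low likelihood to the observation, the opponent is treated as using a previously unseen strategy, which is the point at which the learning module would be invoked.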