Learning to Maximize Return in a Stag Hunt Collaborative Scenario through Deep Reinforcement Learning

Abstract: In this paper we present a deep reinforcement learning approach for learning to play a time-extended social dilemma game in a simulated environment. Agents face different types of adversaries with different levels of commitment to a collaborative strategy. Our method builds on recent advances in policy gradient training using deep neural networks. We investigate multiple stochastic gradient algorithms, such as REINFORCE and actor-critic with auxiliary tasks for faster convergence.