Hierarchical Deep Reinforcement Learning Agent with Counter Self-play on Competitive Games

Anonymous

Sep 27, 2018 (modified: Oct 10, 2018) ICLR 2019 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: Deep Reinforcement Learning algorithms lead to agents that can solve difficult decision making problems in complex environments. However, many difficult multi-agent competitive games, especially real-time strategy games are still considered beyond the capability of current deep reinforcement learning algorithms, although there has been a recent effort to change this \citep{openai_2017_dota, vinyals_2017_starcraft}. Moreover, when the opponents in a competitive game are suboptimal, the current \textit{Nash Equilibrium} seeking, self-play algorithms are often unable to generalize their strategies to opponents that play strategies vastly different from their own. This suggests that a learning algorithm that is beyond conventional self-play is necessary. We develop Hierarchical Agent with Self-play (HASP), a learning approach for obtaining hierarchically structured policies that can achieve higher performance than conventional self-play on competitive games through the use of a diverse pool of sub-policies we get from Counter Self-Play (CSP). We demonstrate that the ensemble policy generated by HASP can achieve better performance while facing unseen opponents that use sub-optimal policies. On a motivating iterated Rock-Paper-Scissor game and a partially observable real-time strategic game (http://generals.io/), we are led to the conclusion that HASP can perform better than conventional self-play as well as achieve 77% win rate against FloBot, an open-source agent which has ranked at position number 2 on the online leaderboards.
  • Keywords: deep reinforcement learning, self-play, real-time strategic game, multi-agent
  • TL;DR: We develop Hierarchical Agent with Self-play (HASP), a learning approach for obtaining hierarchically structured policies that can achieve high performance than conventional self-play on competitive real-time strategic games.
0 Replies

Loading