Proximal Policy Optimization with Elo-based Opponent Selection and Combination with Enhanced Rolling Horizon Evolution Algorithm

Rongqin Liang, Yuanheng Zhu, Zhentao Tang, Mu Yang, Xiaolong Zhu

Published: 2021, Last Modified: 10 May 2023, CoG 2021, Readers: Everyone

Abstract: Two-player zero-sum video games are a basic and important problem in game artificial intelligence. In 2020, the enhanced rolling horizon evolution algorithm with policy gradient (ERHEAPI) beat heuristic, Monte-Carlo tree search, and other methods to win the Fighting Game Artificial Intelligence Competition (FTGAIC). However, ERHEAPI performed poorly in the first round. In this paper, we present an effective method, denoted ERHEAPPO, that combines proximal policy optimization (PPO) with the enhanced rolling horizon evolution algorithm (ERHEA) and opponent model learning to further improve performance. We train the PPO agent and find that Elo-based opponent selection can improve sample efficiency. We compare the proposed ERHEAPPO with ERHEAPI, and the experimental results demonstrate the effectiveness of ERHEAPPO.
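The abstract does not spell out how Elo ratings drive opponent selection, so the following is only a minimal sketch of the general idea, not the authors' procedure. It uses the standard Elo expected-score and update formulas; the selection rule (sampling opponents with probability that decays with the rating gap), the function names, and the `k` and `temperature` parameters are all assumptions made for illustration.

```python
import math
import random


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The K-factor of 32 is a common default, assumed here.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def select_opponent(agent_rating, pool, temperature=200.0, rng=random):
    """Hypothetical selection rule (an assumption, not taken from the paper):
    sample an opponent from `pool` (name -> rating) with probability that
    decreases exponentially with the rating gap to the learning agent.
    """
    names = list(pool)
    weights = [math.exp(-abs(pool[n] - agent_rating) / temperature) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Under this assumed rule, a training loop would call `select_opponent` before each match and `update_elo` after it, so that the agent mostly faces opponents of similar strength. Other rules, such as favoring slightly stronger opponents, fit the same interface.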
