- Keywords: multi-agent environment, continuous adaptation, Nash equilibrium, deep counterfactual regret minimization, reinforcement learning, stochastic game, baseball
- TL;DR: We construct a simplified baseball game scenario to develop and evaluate the adaptation capability of learning agents.
- Abstract: In a multi-agent competitive environment, we would expect an agent who can quickly adapt to environmental changes may have a higher probability to survive and beat other agents. In this paper, to discuss whether the adaptation capability can help a learning agent to improve its competitiveness in a multi-agent environment, we construct a simplified baseball game scenario to develop and evaluate the adaptation capability of learning agents. Our baseball game scenario is modeled as a two-player zero-sum stochastic game with only the final reward. We purpose a modified Deep CFR algorithm to learn a strategy that approximates the Nash equilibrium strategy. We also form several teams, with different teams adopting different playing strategies, trying to analyze (1) whether an adaptation mechanism can help in increasing the winning percentage and (2) what kind of initial strategies can help a team to get a higher winning percentage. The experimental results show that the learned Nash-equilibrium strategy is very similar to real-life baseball game strategy. Besides, with the proposed strategy adaptation mechanism, the winning percentage can be increased for the team with a Nash-equilibrium initial strategy. Nevertheless, based on the same adaptation mechanism, those teams with deterministic initial strategies actually become less competitive.