- Abstract: The Actor-Critic framework for multi-agent reinforcement learning (MARL) is attracting increasing attention. Centralized training with decentralized execution allows the policies to use extra information to ease training while improving overall performance. In such a framework, the quality of the critic profoundly affects the final average reward. We therefore present a method, called Scholastic-Actor-Critic (SMAC), that uses a more powerful critic to remain efficient while acquiring ample knowledge. The headmaster critic is designed to group agents at the proper size and at the proper time, while the other critics update simultaneously at decision time. The learning rule includes additional terms that account for the impact of other agents within a group. Our method achieves higher payoffs than other state-of-the-art methods and is robust against the explosion of dimensionality during training. We apply our method to the Coin Game, Cooperative Treasure Collection (CTC), and a dynamic battle game, MAgent. The experimental results are satisfactory in all three environments.
- Keywords: multi-agent reinforcement learning, Actor-Critic