- Keywords: multiagent, reinforcement learning, iterated dominance, mechanism design, Nash equilibrium
- TL;DR: For games that are solvable by iterated elimination of dominated strategies, we prove that simple standard reinforcement learning algorithms converge to the iterated dominance solution.
- Abstract: Multiagent reinforcement learning (MARL) attempts to optimize policies of intelligent agents interacting in the same environment. However, it may fail to converge to a Nash equilibrium in some games. We study independent MARL under the more demanding solution concept of iterated elimination of strictly dominated strategies. In dominance solvable games, if players iteratively eliminate strictly dominated strategies until no further strategies can be eliminated, we obtain a single strategy profile. We show that convergence to the iterated dominance solution is guaranteed for several reinforcement learning algorithms (for multiple independent learners). We illustrate an application of our results by studying mechanism design for principal-agent problems, where a principal wishes to incentivize agents to exert costly effort in a joint project when it can only observe whether the project succeeded, but not whether agents actually exerted effort. We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.