Abstract: Goal-based agents respond to their environments and adjust their behaviour to reach objectives. Understanding the incentives of interacting agents from observed behaviour is a core problem in multi-agent systems. Inverse reinforcement learning (IRL) addresses this problem by inferring underlying reward functions from the observed behaviour of rational agents. Although IRL is principled, it becomes intractable as the number of agents grows, owing to the curse of dimensionality and the explosion of agent interactions. The formalism of mean-field games (MFGs) has gained momentum as a mathematically tractable paradigm for studying large-scale multi-agent systems. By grounding IRL in MFGs, recent research has pushed IRL towards much larger agent populations. However, the study of IRL for MFGs is far from mature: existing methods assume strong rationality, while real-world agents often exhibit bounded rationality due to limited cognitive or computational capacity. Towards a more general and practical IRL framework for MFGs, this paper proposes Mean-Field Adversarial IRL (MF-AIRL), a novel framework that tolerates bounded rationality. We build it upon the maximum entropy principle, adversarial learning, and a new equilibrium concept for MFGs. We evaluate MF-AIRL on simulated tasks with imperfect demonstrations arising from bounded rationality. Experimental results demonstrate the superiority of MF-AIRL over existing methods in reward recovery.
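To make the adversarial-learning component of the abstract concrete, below is a minimal sketch (not the authors' implementation) of an AIRL-style discriminator whose learned reward additionally conditions on a mean-field summary of the population, as a mean-field adversarial IRL method might do. The network sizes, the mean-field encoding `mf_dim`, and the training step are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MeanFieldRewardDiscriminator(nn.Module):
    """AIRL-style discriminator with a reward f(s, a, mu) over state, action,
    and a mean-field summary mu of the population (hypothetical architecture)."""

    def __init__(self, state_dim: int, action_dim: int, mf_dim: int, hidden: int = 64):
        super().__init__()
        self.reward_net = nn.Sequential(
            nn.Linear(state_dim + action_dim + mf_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action, mean_field, log_pi):
        # Discriminator D = exp(f) / (exp(f) + pi(a|s)), expressed through its
        # logit f - log pi(a|s) for numerical stability; sigmoid(logit) = D.
        f = self.reward_net(torch.cat([state, action, mean_field], dim=-1)).squeeze(-1)
        return f - log_pi

def discriminator_step(disc, optimizer, expert_batch, policy_batch):
    """One adversarial update: expert transitions labelled 1, policy samples 0.
    Each batch is a tuple (state, action, mean_field, log_pi) of tensors."""
    bce = nn.BCEWithLogitsLoss()
    logits_exp = disc(*expert_batch)
    logits_pol = disc(*policy_batch)
    loss = bce(logits_exp, torch.ones_like(logits_exp)) + \
           bce(logits_pol, torch.zeros_like(logits_pol))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In such a scheme the recovered reward is read off from `reward_net`, while the imitation policy is trained against it with any RL procedure; the specifics of the equilibrium concept and the bounded-rationality model are developed in the paper itself.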