Abstract: Independent learners naively employ single-agent learning algorithms in multi-agent systems, oblivious to the effect of the other strategic agents present in their environment. This paper studies partially observed N-player mean-field games from a decentralized learning perspective, with two primary objectives: (i) to study the convergence properties of independent learners, and (ii) to identify structural properties that can guide algorithm design. Toward the first objective, we study the learning iterates obtained by independent learners and find that these iterates converge under mild conditions. We then present a notion of subjective equilibrium suitable for analyzing independent learners. Toward the second objective, we study policy updating processes subject to a so-called ε-satisficing condition: agents who are subjectively ε-best-responding at a given joint policy do not change their policy. After establishing structural results for such processes, we develop an independent learning algorithm for N-player mean-field games. Exploiting the aforementioned structural results, we give guarantees of convergence to subjective ε-equilibrium under self-play.
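To make the ε-satisficing condition concrete, the following Python snippet is a minimal illustrative sketch, not the paper's algorithm: the function name, the exploration probability, and the use of externally supplied subjective value estimates are all assumptions made for illustration only.

```python
import random

def epsilon_satisficing_step(policy, subjective_value, subjective_best_value,
                             epsilon, policy_space, explore_prob=0.2):
    """One hypothetical policy-update step under an epsilon-satisficing rule.

    An agent whose subjectively estimated value is within epsilon of its
    subjectively estimated best-response value is "satisfied" and keeps its
    policy; otherwise it may experiment by redrawing a policy at random.
    """
    if subjective_value >= subjective_best_value - epsilon:
        # Subjectively epsilon-best-responding: do not change the policy.
        return policy
    if random.random() < explore_prob:
        # Dissatisfied: experiment with a randomly chosen alternative policy.
        return random.choice(policy_space)
    return policy
```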