Keywords: Reinforcement Learning, Ad Hoc Teamwork, Multi-Agent Learning, Shapley Value
TL;DR: We propose an axiomatic game-theoretic framework for n-agent ad hoc teamwork (NAHT), showing how the axioms characterizing the Shapley value guide the design of reinforcement learning algorithms with improved performance.
Abstract: Open multi-agent systems are increasingly relevant for modelling emerging real-world domains such as smart grids and swarm robotics. This paper addresses the recently posed problem of n-agent ad hoc teamwork (NAHT), where only a subset of agents is controllable. Existing approaches rely on heuristic designs that lack theoretical grounding. We propose an axiomatic game-theoretic framework for NAHT, formulated via the state-specific cooperative game space. Within this framework, the axiomatic characterization of the Shapley value (Efficiency, Symmetry, and Linearity) is reinterpreted as a set of structural constraints on individual value functions. This yields a principled design space: enforcing all three axioms recovers the Shapley value, while dropping Efficiency yields the Banzhaf index. As concrete instantiations, we develop the Shapley Machine and the Banzhaf Machine, which enforce these two subsets of axioms during learning. Implemented on top of IPPO and POAM, these algorithms deliver stronger performance than their base methods; notably, relaxing the Efficiency axiom may even outperform enforcing the full set in terms of agent-type generalization.
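For reference, a minimal sketch of the two solution concepts named above, in their standard cooperative-game form over a game $(N, v)$ with player set $N$ and characteristic function $v$; the paper's state-specific formulation additionally conditions these quantities on the environment state, which is not shown here:

$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr]$ (Shapley value)

$\beta_i(v) = \frac{1}{2^{|N|-1}} \sum_{S \subseteq N \setminus \{i\}} \bigl[v(S \cup \{i\}) - v(S)\bigr]$ (Banzhaf index)

Both aggregate the same marginal contributions $v(S \cup \{i\}) - v(S)$; the Shapley value weights coalitions so that Efficiency holds ($\sum_{i \in N} \phi_i(v) = v(N)$), whereas the Banzhaf index averages them uniformly and in general does not satisfy Efficiency, mirroring the axiom the Banzhaf Machine relaxes.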
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 12937