Aligning Self-Interested Agents with Welfare Maximization in Non-cooperative Equilibrium via Mechanism Design
Abstract: Self-interested multi-agent reinforcement learning has attracted growing attention for its applicability to real-world scenarios. In such settings, social dilemmas often arise in which agents prioritize individual gains over social welfare, so addressing these dilemmas is critical for improving social welfare. However, prior work has notable limitations: (1) opponent-modeling and incentive-design approaches rely heavily on access to other agents’ internal parameters and detailed information; as the number of agents grows or access to such information becomes limited, accurately modeling others’ impact becomes difficult and performance degrades; (2) centralized training is often ineffective, since a single global training signal fails to capture the heterogeneous objectives and behaviors of self-interested agents, limiting effective individual policy learning. To overcome these limitations, we propose a mechanism-design approach that leverages centralized information rather than centralized learning and does not require access to other agents’ internal parameters. This mechanism dynamically reshapes each agent’s reward to align individual incentives with social welfare. Building on this mechanism, we develop a value iteration algorithm that integrates a counterfactual critic with a maximized return predictor, further improving learning effectiveness. Extensive experiments in social dilemma environments demonstrate that our method achieves higher social welfare than existing baselines.
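The abstract does not specify the exact form of the reward-reshaping mechanism, so the following is only a minimal illustrative sketch of the general idea of aligning individual incentives with social welfare. It assumes a simple convex combination of each agent's own reward and the group's mean reward (a common proxy for social welfare); the blending weight `alpha` and the function `reshape_rewards` are hypothetical names, not the paper's formulation.

```python
import numpy as np

def reshape_rewards(rewards: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend each agent's individual reward with a social-welfare term.

    rewards: shape (n_agents,), per-agent rewards at one timestep.
    alpha:   weight on the individual reward; (1 - alpha) goes to the
             mean reward of all agents, used here as a welfare proxy.
    """
    # Uses centralized information (all agents' rewards) without
    # requiring access to any agent's internal policy parameters.
    social_welfare = rewards.mean()
    return alpha * rewards + (1.0 - alpha) * social_welfare

# Example: a 3-agent social dilemma step where defection pays individually.
r = np.array([3.0, 0.0, 0.0])         # agent 0 free-rides on the others
print(reshape_rewards(r, alpha=0.5))  # -> [2.0, 0.5, 0.5]
```

Under this kind of reshaping, an agent's effective return increases with the welfare of the group, so a self-interested best response is pushed toward welfare-improving behavior; the paper's actual mechanism is presumably more sophisticated (e.g., dynamic and counterfactual-based), as described in the abstract.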
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Sungsoo_Ahn1
Submission Number: 7058