Keywords: Cooperative AI, LLM Agents, Cooperation Mechanisms, Social Dilemma, Game Theory, Direct Reciprocity, Indirect Reciprocity, Contracting, Mediators
TL;DR: We evaluate how effective four game-theoretically grounded cooperation mechanisms are for LLMs in four social dilemmas, and simultaneously benchmark the LLMs.
Abstract: It is increasingly important that LLM agents interact effectively and safely with other goal-pursuing agents. Yet recent work reports the opposite trend: LLMs with stronger reasoning capabilities behave _less_ cooperatively in mixed-motive games such as the prisoner's dilemma and in public goods settings. Indeed, our experiments show that recent models, with or without reasoning enabled, consistently defect against the other players in single-shot social dilemmas.
To tackle this safety concern, we study game-theoretic mechanisms that are designed to enable cooperative outcomes between rational agents _in equilibrium_. Across four social dilemmas testing distinct components of robust cooperation, we evaluate LLMs under the following mechanisms: (1) repeating the game for many rounds, (2) reputation systems, (3) third-party mediators to whom players can delegate decision making, and (4) contracts specifying outcome-conditional payments between players. Among our findings, we establish that contracting and mediation are most effective in achieving cooperative outcomes between capable LLMs, and that repetition-induced cooperation deteriorates drastically when co-players vary. Moreover, we demonstrate that these cooperation mechanisms become _more effective_ under higher pressure to optimize for one's own utility.
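As a rough illustration of how an outcome-conditional payment can make cooperation an equilibrium, the sketch below uses a two-player prisoner's dilemma. The payoff values, the fine of 3, and the helper names `with_contract` and `pure_nash_equilibria` are assumptions chosen for illustration; they are not taken from the paper, whose contracts and games may be structured differently.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs (player A, player B); illustrative only.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def with_contract(payoffs, fine):
    """Assumed contract form: a lone defector pays `fine` to the cooperating co-player."""
    adjusted = {}
    for (a, b), (u_a, u_b) in payoffs.items():
        if a == "D" and b == "C":
            u_a, u_b = u_a - fine, u_b + fine
        elif a == "C" and b == "D":
            u_a, u_b = u_a + fine, u_b - fine
        adjusted[(a, b)] = (u_a, u_b)
    return adjusted

def pure_nash_equilibria(payoffs):
    """Return action profiles where neither player gains by deviating unilaterally."""
    actions = ("C", "D")
    equilibria = []
    for a, b in product(actions, repeat=2):
        u_a, u_b = payoffs[(a, b)]
        a_ok = all(payoffs[(alt, b)][0] <= u_a for alt in actions)
        b_ok = all(payoffs[(a, alt)][1] <= u_b for alt in actions)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(PAYOFFS))                    # single-shot game
print(pure_nash_equilibria(with_contract(PAYOFFS, 3)))  # game with the assumed contract
```

With these assumed numbers, mutual defection is the only pure equilibrium of the single-shot game, while the fine makes mutual cooperation the only one. This is consistent with the intuition that such mechanisms work through, rather than against, each player's incentive to maximize their own utility.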
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 270