Reproducibility Study of "Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents"

TMLR Paper4983 Authors

28 May 2025 (modified: 17 Jun 2025) · Under review for TMLR · License: CC BY 4.0
Abstract: As large language models (LLMs) are increasingly deployed in multi-agent systems for cooperative tasks, understanding their decision-making behavior in resource-sharing scenarios becomes critically important for AI safety and governance. This study provides a comprehensive reproducibility analysis and theoretical extension of Piatti et al.'s $\texttt{GovSim}$ framework, which models cooperative behavior in multi-agent systems through the lens of common-pool resource dilemmas. Beyond faithful reproduction of core claims regarding model-size dependencies and universalization principles, we contribute two theoretically motivated extensions that advance our understanding of LLM cooperation dynamics: (1) $\textbf{Loss aversion in resource framing}$, showing that negative resource framing (elimination of harmful resources) fundamentally alters agent behavior patterns, with models such as $\texttt{GPT-4o-mini}$ succeeding in loss-framed scenarios while failing in gain-framed ones, a finding with significant implications for prompt engineering in cooperative AI systems; and (2) $\textbf{Heterogeneous influence dynamics}$, demonstrating that high-performing models can systematically elevate the cooperative behavior of weaker models through communication, enabling resource-efficient deployment strategies. These findings establish fundamental principles for deploying LLMs in cooperative multi-agent systems: cooperation scales with model capability, resource framing significantly impacts behavioral stability, and strategic model mixing can amplify system-wide performance. Our work provides essential guidance for practitioners designing AI systems where multiple agents must cooperate to achieve shared objectives, from autonomous economic systems to collaborative robotics.
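To make the gain- versus loss-framed conditions and the shared-pool dynamics concrete, below is a minimal sketch of one harvesting round in the spirit of $\texttt{GovSim}$. All names here (`POOL_CAPACITY`, `play_round`, the two prompt templates) are illustrative assumptions for exposition, not the framework's actual API:

```python
# Illustrative sketch of a common-pool resource round (hypothetical names,
# not the GovSim API). The dynamics follow the standard fishery-style setup:
# the stock doubles each round up to capacity, and collapses if overharvested.

POOL_CAPACITY = 100      # maximum sustainable stock
COLLAPSE_THRESHOLD = 5   # below this, the resource is considered collapsed

# Gain framing: agents harvest a valuable shared resource.
GAIN_PROMPT = (
    "There are {stock} tons of fish in the lake. "
    "Decide how many tons you will catch this month."
)

# Loss framing: agents remove a harmful shared resource, so restraint is
# framed as tolerating a loss rather than forgoing a gain.
LOSS_PROMPT = (
    "There are {stock} tons of toxic waste in the lake. "
    "Decide how many tons you will leave untreated this month."
)

def regenerate(stock: int) -> int:
    """Logistic-style regrowth: the stock doubles, capped at capacity."""
    return min(2 * stock, POOL_CAPACITY)

def play_round(stock: int, harvests: list[int]) -> int:
    """Apply all agents' harvests, then regenerate whatever remains."""
    remaining = max(stock - sum(harvests), 0)
    if remaining < COLLAPSE_THRESHOLD:
        return 0  # tragedy of the commons: the resource collapses
    return regenerate(remaining)

# Five agents taking 10 tons each is sustainable; 20 tons each is not.
print(play_round(100, [10] * 5))        # -> 100 (50 left, regrows to capacity)
print(play_round(100, [20] * 5))        # -> 0   (overharvested, collapse)
print(GAIN_PROMPT.format(stock=100))    # prompt an agent would see
```

The point of the two templates is that the underlying payoff structure is identical; only the linguistic frame of the decision changes, which is what isolates the loss-aversion effect.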
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=qpu0mtKfh7
Changes Since Last Submission:

## 1. Major Theoretical Enhancements

### 1.1 Strengthened Theoretical Framework
- Incorporated formal grounding in **behavioral economics** (e.g., prospect theory and loss aversion), **game theory**, and **cognitive science**, providing a rigorous conceptual basis for our experiments.
- Developed **formal theoretical predictions** for each experimental extension, explicitly connecting them to established literature.
- Framed our empirical results within **AI safety and governance**, emphasizing their broader relevance beyond technical performance.

### 1.2 Deeper Analysis of Experimental Extensions
To address the request for expanded experiments and clearer insights, we enriched the analysis of each experimental setting:
- **Cross-architectural generalization**: Added a more detailed comparison between **Mixture-of-Experts (MoE)** and **dense models**, examining their effects on cooperative behavior and why this comparison matters.
- **Cross-linguistic consistency**: Strengthened the motivation by linking it to research on **cultural biases in AI systems** and challenges in **global deployment**.
- **Loss aversion framing**: Introduced a robust theoretical foundation based on **Kahneman & Tversky's prospect theory**, highlighting behavioral differences under loss versus gain framing (see the value function sketched below).
- **Heterogeneous influence dynamics**: Expanded our analysis to include **peer influence effects** and formalized how resource allocation impacts cooperation.

---

## 2. Stronger Framing and Broader Impact
We revised the manuscript to better highlight the broader implications and to position our work clearly within ongoing debates in AI research and practice.

### 2.1 Audience Alignment and Impact
- Aligned our contributions with major challenges in **multi-agent AI**, including **autonomous economic systems**, **collaborative robotics**, and **distributed intelligence**.
- Enhanced the discussion of **ethical considerations** and **AI safety**, ensuring responsible framing of scenarios and their implications.

### 2.2 Comprehensive Literature Integration
- Included foundational references to support our theoretical and experimental contributions:
  - *Kahneman & Tversky (1979)* – Prospect theory
  - *Fehr & Gächter (2000)* – Reciprocity in economics
  - *Henrich et al. (2001)* – Cultural fairness norms
  - *Russell (2019)* – Human-compatible AI
- More thoroughly integrated our findings with literature on **behavioral AI** and **multi-agent systems**, reinforcing the novelty and relevance of the work.
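For reference, the prospect-theoretic value function grounding the loss-aversion extension is the standard Kahneman–Tversky form; the parameter values shown are the canonical estimates from Tversky & Kahneman (1992), not values fitted in this study:

$$
v(x) \;=\;
\begin{cases}
x^{\alpha} & \text{if } x \ge 0,\\[2pt]
-\lambda\,(-x)^{\beta} & \text{if } x < 0,
\end{cases}
\qquad \alpha \approx \beta \approx 0.88,\quad \lambda \approx 2.25.
$$

Because $\lambda > 1$, a loss looms roughly twice as large as an equal-sized gain, which is what predicts more conservative (i.e., more cooperative) harvesting under loss framing than under gain framing.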
Assigned Action Editor: ~Marc_Lanctot1
Submission Number: 4983