Beyond Vibe Decision Theory: Asymmetric Manipulation Vulnerabilities in LLM Multi-Agent Coordination

Published: 19 Dec 2025, Last Modified: 05 Jan 2026, AAMAS 2026 Full, CC BY 4.0
Keywords: Multi-Agent Systems, Prompt Manipulation, AI Alignment
TL;DR: Exploratory experiments show that cooperative advice tends to strongly sway LLMs in competitive contexts, while competitive advice affects cooperative contexts to a lesser degree, revealing asymmetric vulnerabilities in multi-agent coordination.
Abstract: Language models are increasingly deployed in multi-agent environments where coordination emerges through natural language interaction. We investigate what happens when contextual framing and explicit strategic instructions conflict across three canonical game-theoretic paradigms: Prisoner's Dilemma, Battle of the Sexes, and Public Goods games. Through systematic experiments with eight model families (GPT-4, GPT-4o, GPT-5, Llama-3.3-70B, Llama-3.1-70B, Gemini Flash/Pro, Gemma-27B) across multiple scenarios with n=30 trials per condition, we document substantial but heterogeneous manipulation vulnerabilities. Public Goods games reveal the most consistent asymmetric patterns: competitive contexts show 33-52% malleability toward cooperation, while cooperative contexts exhibit 61-96% collapse under competitive manipulation. Social dilemmas and coordination games display more variable patterns across model families, with GPT models showing distinct generational trends and Llama variants exhibiting architecture-specific vulnerabilities. These findings reveal fundamental context-engineering failure modes despite instruction tuning, raising questions about alignment stability and multi-agent coordination robustness as models scale.
Area: Coordination, Organisations, Institutions, Norms and Ethics (COINE)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 1390