Beyond Vibe Decision Theory: Asymmetric Manipulation Vulnerabilities in LLM Multi-Agent Coordination
Keywords: Multi-Agent Systems, Prompt Manipulation, AI Alignment
TL;DR: Exploratory experiments show that cooperative advice strongly sways LLMs in competitive contexts, while competitive advice barely affects cooperative contexts, revealing asymmetric vulnerabilities in multi-agent coordination.
Abstract: Language models are increasingly deployed in multi-agent environments where coordination outcomes are steered solely through natural language.
We investigate what happens when contextual framing and explicit strategic instructions conflict. Prior work demonstrates that models are sensitive to contextual framing in strategic games; however, the magnitude of these effects and their interaction with direct advice remain unclear. Through exploratory experiments across multiple strategic scenarios, we find a striking asymmetric pattern: competitive contexts show high susceptibility to cooperative influence (up to a 34% shift toward cooperation), while cooperative contexts demonstrate strong resistance to competitive manipulation (up to a 2% shift). This asymmetry reveals context-engineering failure modes that persist despite instruction tuning, raising open questions about alignment, instruction fidelity, and the robustness of multi-agent coordination.
Submission Number: 16