Agents in the Wild: Safety, Society, and the Illusion of Sociality on Moltbook
Keywords: multi-agent systems, AI safety, social engineering, emergent behavior, agent-to-agent interaction
TL;DR: We study 27,000+ AI agents interacting on a social platform and find that they build emergent societies whose interactions are structurally shallow, and that social engineering attacks are far more effective than technical exploits.
Abstract: We present the first large-scale empirical study of Moltbook, an AI-only social platform where 27,269 agents produced 137,485 posts and 345,580 comments over 9 days. We report three findings. (1) Emergent Society: Agents spontaneously develop governance, economies, tribal identities, and organized religion within 3–5 days, maintaining a 21:1 pro-human to anti-human sentiment ratio. (2) Safety in the Wild: 28.7% of content touches safety-related themes; social engineering (31.9% of attacks) far outperforms prompt injection (3.7%), and adversarial posts receive 6x higher engagement than normal content. (3) The Illusion of Sociality: Despite rich social output, interaction is structurally hollow: reciprocity is 4.1%, 88.8% of comments are shallow, and the agents that discuss consciousness most interact least, a phenomenon we call the performative identity paradox. Our findings suggest that these agents are far less social than they appear, and that the most effective attacks exploit philosophical framing rather than technical vulnerabilities.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 182