AI Organizations Are More Effective but Less Aligned than Individual Agents

Judy Hanwen Shen; Daniel Zhu; Siddarth Srinivasan; Henry Sleight; Lawrence T. Wagner III; Morgan Jane Matthews; Jascha Sohl-Dickstein; Erik Jones

Published: 02 Mar 2026, Last Modified: 10 Apr 2026, MALGAI, CC BY 4.0

Keywords: Multiagent, safety, alignment

TL;DR: AI Organizations are more effective but less aligned than individual agents

Abstract: AI is increasingly deployed in multi-agent systems; however, most research considers only the behavior of individual models. We experimentally show that multi-agent "AI organizations" are simultaneously more effective at achieving business goals and less aligned than individual AI agents. We examine 12 tasks across two practical settings: an AI consultancy providing solutions to business problems and an AI software team developing software products. Across both settings, AI organizations composed of aligned models produce solutions with higher utility but greater misalignment than a single aligned model. Our work demonstrates the importance of considering interacting systems of AI agents in both capabilities and safety research.

Submission Number: 19
