Agents Aren't Agents: The Agency, Loyalty, and Accountability Problems of AI Agents

ICLR 2026 Conference Submission15008 Authors

19 Sept 2025 (modified: 26 Jan 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: AI agents, agency, alignment, fiduciary duties, large language models, loyalty, accountability
TL;DR: AI agents resemble human agents but lack personhood and undivided loyalty, making agency law an unreliable governance tool.
Abstract: As AI agents take on responsibilities of increasing breadth and depth, questions of control, loyalty, and accountability become urgent. Common law agency doctrine emerges as a seemingly promising pathway for addressing these alignment challenges. This paper argues that the translation is not as straightforward as it might first appear. AI agents operate through fragmented layers of control involving developers, hosts, and service providers, which blur lines of responsibility and divide loyalties among many different instructions. These structural differences make it difficult for traditional agency principles, built on assumptions about human intention and deterrence, to fit the context of AI systems. We examine three problems. Agency: in the polyadic governance structure of AI development and deployment, who counts as the principal and who counts as the agent? Loyalty: can AI agents meaningfully serve a principal's best interests? Accountability: when AI agents make mistakes, who should be held responsible? Common law alone cannot resolve these tensions. Building on these findings, we outline two pathways for drawing on agency law as an interpretive and design-oriented resource. First, statutory reform, such as the EU AI Act and its accompanying liability directives, is necessary, just as legislatures have intervened to govern institutional forms of agency such as financial advisers or talent representatives. Second, duty-of-loyalty principles may offer conceptual inspiration for technical implementations that support responsible AI behavior.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 15008