META-GOVERNANCE ARCHITECTURES FOR MULTI-AGENT SYSTEM SAFETY, ALIGNMENT, GOVERNANCE, AND SECURITY

Published: 01 Mar 2026, Last Modified: 24 Apr 2026, Venue: ICLR 2026 AIWILD, License: CC BY 4.0
Keywords: MAS, AI Governance, AI Risks
TL;DR: Providing governance agents and deterministic controls for probabilistic multi-agent systems.
Abstract: Enterprise deployment of autonomous multi-agent systems (MAS) has surged, yet existing governance frameworks, designed for traditional software or single-agent systems, prove inadequate for managing emergent behaviors, coordination vulnerabilities, and distributed agency. We introduce meta-governance, the use of specialized intelligent agents to monitor and control operational agent fleets, realized through SafeAlign AI Governance and Responsible AI OS, as a scalable paradigm for achieving comprehensive Safety, Alignment, Governance, and Security (SAGS) in production MAS deployments. Through analysis of regulatory requirements (EU AI Act, NIST AI RMF, Singapore Framework), documented failure modes, and novel attack vectors including inter-agent trust exploitation, we establish design principles for production-grade MAS governance systems. We validate these principles in deployment scenarios in regulated industries (financial services, healthcare, and pharmaceuticals) that manage 100+ operational agents, demonstrating that meta-governance can achieve sub-second intervention latency, 100% safety-critical policy compliance, and over 90% automated decision handling while maintaining comprehensive audit trails. Our framework addresses the fundamental asymmetry between attack propagation speed and human oversight capacity, enabling enterprises to deploy autonomous agents at scale while maintaining regulatory compliance and mitigating risk.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 228