Keywords: Scalable Oversight, Robustness, Safety Constraints, Interpretability, Governance, Evaluation
Abstract: Agentic AI systems capable of reasoning, planning, and acting present governance challenges that differ fundamentally from those of conventional models. Because these systems can exhibit emergent, unexpected behaviors during execution, many risks cannot be fully anticipated pre-deployment. We present $\textbf{MI9}$, an integrated framework for runtime safety of agentic AI, in which safety properties are enforced over live behavior sequences. MI9 provides six coordinated mechanisms: an Agency-Risk Index, agent-semantic telemetry, goal-aware authorization monitoring, finite-state conformance engines, goal-conditioned drift detection, and graded containment. These mechanisms operate in a model- and infrastructure-agnostic manner across heterogeneous agent stacks. MI9 is a framework layer that instruments and governs existing systems to enable systematic, safe deployment at scale. In evaluations over 1,000 diverse multi-domain synthetic scenarios, MI9 achieves a high detection rate with a low false-positive rate. By shifting the locus of assurance to runtime safety, MI9 establishes a practical foundation for comprehensive, operational oversight of agentic AI.
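To make the flavor of runtime enforcement concrete, the sketch below shows a minimal finite-state conformance engine of the kind the abstract names: it checks an agent's live action sequence against a permitted protocol and flags the first deviation. All states, actions, and transitions here are illustrative assumptions, not MI9's actual specification.

```python
# Hypothetical sketch of one MI9-style mechanism: a finite-state
# conformance engine over a live action stream. States, actions, and
# the example protocol are assumptions for illustration only.

from typing import Dict, Tuple


class ConformanceEngine:
    """Flags the first action that deviates from the permitted protocol."""

    def __init__(self, transitions: Dict[Tuple[str, str], str], start: str):
        self.transitions = transitions  # maps (state, action) -> next state
        self.state = start

    def observe(self, action: str) -> bool:
        """Advance on a conforming action; return False on a violation."""
        key = (self.state, action)
        if key not in self.transitions:
            return False  # non-conformant: a hook for graded containment
        self.state = self.transitions[key]
        return True


# Illustrative protocol: the agent must authorize before acting, then report.
engine = ConformanceEngine(
    transitions={
        ("idle", "authorize"): "authorized",
        ("authorized", "act"): "acted",
        ("acted", "report"): "idle",
    },
    start="idle",
)

assert engine.observe("authorize")      # conforming step
assert engine.observe("act")            # conforming step
assert not engine.observe("act")        # repeated action violates the protocol
```

In a deployed system, a `False` return would feed the containment logic rather than an assertion; the point of the sketch is only that conformance checking reduces to a transition lookup per observed action.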
Submission Number: 5