Let there be Frontier Model System Certification

Published: 28 Apr 2026, Last Modified: 28 Apr 2026 · MSLD 2026 Poster · CC BY 4.0
Keywords: Agentic systems, Formal methods, AI evaluation
TL;DR: We present the first frameworks to formally certify frontier model systems.
Abstract: Frontier Model Systems (FMS) increasingly drive complex agentic workflows but suffer from unpredictable brittleness. Existing evaluation methods—such as empirical benchmarking and adversarial attacks—are fundamentally inadequate, limited by finite test coverage, test-set leakage, and a narrow focus on isolated edge cases. To address this, we propose a paradigm shift in FMS evaluation toward black-box statistical certification. We first introduce a foundational framework that leverages probabilistic programming to establish high-confidence bounds on the probability of desirable model behavior, demonstrating its efficacy by exposing severe vulnerabilities in tool selection by agentic systems. To generalize our approach across arbitrary domains, we present LUMOS (LangUage MOdel Specifications), the first domain-specific language for FMS certification. LUMOS tackles the informal nature of safety properties by modeling prompt spaces as text-rich graphs, integrating native multimodal support, and using probabilistic constructs for IID scenario sampling and statistical certification. By unifying graph-based logic with probabilistic execution, LUMOS scales certification from text-based tasks to novel visual domains, enabling the first formal safety certification of Vision-Language Models (VLMs) in autonomous driving. This democratizes FMS evaluation with formal certification and exposes subtle vulnerabilities prior to deployment.
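To make the idea of black-box statistical certification concrete, the following is a minimal sketch (not the paper's actual framework): it draws IID scenarios, checks a desirable-behavior predicate on each, and derives a high-confidence lower bound on the true probability of desirable behavior. Hoeffding's inequality is used here as an illustrative concentration bound; the paper may use a tighter or different statistical procedure, and `behavior_ok`, `sample_scenario`, and the toy FMS stand-in below are all hypothetical names introduced for this sketch.

```python
import math
import random

def certify(behavior_ok, sample_scenario, n=10_000, delta=1e-3):
    """Sketch of black-box statistical certification.

    Samples n IID scenarios, evaluates the desirable-behavior
    predicate on each, and returns a (1 - delta)-confidence lower
    bound on the true success probability via Hoeffding's inequality:
        p >= p_hat - sqrt(ln(1/delta) / (2n))  with prob. >= 1 - delta
    """
    successes = sum(bool(behavior_ok(sample_scenario())) for _ in range(n))
    p_hat = successes / n
    margin = math.sqrt(math.log(1 / delta) / (2 * n))
    return max(0.0, p_hat - margin)

# Toy stand-in for an FMS whose behavior is desirable ~95% of the time.
random.seed(0)
lower_bound = certify(lambda s: s < 0.95, random.random)
```

Because the model is queried only through the predicate, the procedure needs no access to weights or logits, which is what lets the same certification recipe apply to text agents and VLMs alike.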
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 187