Keywords: AI Safety, Monitoring, Auditing, AI Governance, AI Control
Abstract: For small and medium-sized enterprises (SMEs), many without significant in-house AI expertise, the opportunities presented by successful adoption of advanced AI technologies such as LLMs must be weighed against the risks to their businesses and to society. In this paper we propose a set of techniques that allow SMEs, as well as third-party developers and evaluators, to specify product-specific (temporally extended) behavioral constraints such as safety constraints, norms, rules, and regulations, and to perform offline auditing or online (runtime) monitoring to assess compliance. To do so, we adapt and extend mechanisms from formal methods, historically used in process monitoring, for use with advanced AI systems (notably, LLMs). We further provide techniques for predictive monitoring and introduce intervening monitors that act at runtime to preempt and potentially mitigate predicted violations. We evaluate several black-box intervention techniques and demonstrate empirically that our predictive and intervening monitors can reduce violation rates in current LLM-based agents.
Submission Number: 146