Is my action policy safe? PolIC3 to the rescue

ICLR 2026 Conference Submission 12978 Authors

18 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | CC BY 4.0

Keywords: Action Policy Verification; RL Policy Analysis; Trustworthy sequential action decision making

TL;DR: We present a new state-of-the-art action policy verification method, based on the well-known IC3 verification algorithm, for neural network and tree ensemble policies.

Abstract: The use of machine learning in sequential decision-making tasks has grown substantially, intensifying concerns regarding the safety of learned policies and motivating research on policy verification. We present a new policy verification method based on the well-known IC3 algorithm. Unlike existing approaches, ours decouples reasoning about policy decisions from reasoning about the effects of these decisions on the environment in which the policy is executed. This separation allows us to leverage the latest advances in machine learning certification tools to handle the former subproblem, whilst relying on specialized solvers for the latter. Experiments confirm that our approach scales better and supports a wider variety of policy architectures than current state-of-the-art methods.
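To make the decoupling described in the abstract concrete, here is a minimal illustrative sketch. It is not the PolIC3 algorithm and not IC3: it is plain explicit-state breadth-first reachability on a small finite toy problem. It shows only how a safety check can query the policy ("which action is chosen here?") and the environment ("what can happen after that action?") through two separate interfaces. All names (`find_unsafe_trace`, `corridor_successors`, the corridor environment) are hypothetical and chosen for illustration.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List, Optional

State = Hashable
Action = Hashable


def find_unsafe_trace(
    initial_states: Iterable[State],
    policy: Callable[[State], Action],
    successors: Callable[[State, Action], Iterable[State]],
    is_unsafe: Callable[[State], bool],
) -> Optional[List[State]]:
    """Return a shortest trace from an initial state to an unsafe state,
    or None if no unsafe state is reachable under the policy.

    Illustrative only: explicit-state BFS, assuming a finite state space.
    """
    parent = {}  # maps each visited state to its predecessor (None for initial states)
    queue = deque()
    for s in initial_states:
        if s not in parent:
            parent[s] = None
            queue.append(s)

    while queue:
        state = queue.popleft()
        if is_unsafe(state):
            # Rebuild the counterexample by walking predecessors back to an initial state.
            trace = []
            cur = state
            while cur is not None:
                trace.append(cur)
                cur = parent[cur]
            return trace[::-1]
        action = policy(state)                 # policy-side query
        for nxt in successors(state, action):  # environment-side query
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def corridor_successors(pos: int, action: str) -> set:
    """Hypothetical toy environment: positions 0..5; the agent either moves or slips in place."""
    step = 1 if action == "right" else -1
    moved = min(5, max(0, pos + step))
    return {moved, pos}


def is_cliff(pos: int) -> bool:
    """Position 5 is the unsafe state in this toy example."""
    return pos == 5


def cautious_policy(pos: int) -> str:
    """Turns back at position 3, so it never reaches position 5."""
    return "right" if pos < 3 else "left"


def reckless_policy(pos: int) -> str:
    """Always moves right, so it eventually reaches position 5."""
    return "right"


safe_result = find_unsafe_trace([0], cautious_policy, corridor_successors, is_cliff)
unsafe_result = find_unsafe_trace([0], reckless_policy, corridor_successors, is_cliff)
```

In this sketch the policy and the environment are opaque callables, so either side could in principle be backed by a different reasoning tool without changing the search loop. A symbolic method such as the one the abstract describes would reason over sets of states rather than enumerating them one by one.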

Supplementary Material: zip

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 12978
