Keywords: Action Policy Verification; RL Policy Analysis; Trustworthy Sequential Action Decision Making
TL;DR: We present a new state-of-the-art action policy verification method for neural network and tree ensemble policies, based on the well-known IC3 verification algorithm.
Abstract: The use of machine learning in sequential decision-making tasks has grown substantially, intensifying concerns regarding the safety of learned policies and motivating research on policy verification. We present a new policy verification method based on the well-known IC3 algorithm. Unlike existing approaches, ours decouples reasoning about policy decisions from reasoning about the effects of these decisions on the environment in which the policy is executed. This separation allows us to leverage the latest advances in machine learning certification tools to handle the former subproblem, whilst relying on specialized solvers for the latter. Experiments confirm that our approach scales better and supports a wider variety of policy architectures than current state-of-the-art methods.
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 12978