Keywords: Safe Bayesian optimization, reproducing kernel Hilbert spaces, PAC learning, robotics
TL;DR: We propose a safe Bayesian optimization algorithm that over-estimates the RKHS norm with statistical guarantees.
Abstract: Popular safe Bayesian optimization (BO) algorithms successfully control safety-critical systems in unknown environments. However, most algorithms require smoothness assumptions, which are encoded by a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it remains unclear how to reliably obtain the RKHS norm of an unknown function. In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. We provide statistical guarantees on the RKHS norm estimation, derive novel confidence intervals, and prove the safety of the resulting safe BO algorithm. We apply our algorithm to safely optimize reinforcement learning policies on physics simulators and on a real Furuta pendulum, demonstrating improved performance, safety, and scalability compared to the state of the art.
Submission Number: 30