Keywords: Model-based Reinforcement learning, Safe-RL, Evolutionary method, Planning
Abstract: In the last decade, reinforcement learning has successfully solved complex control
tasks and decision-making problems, such as the board game Go. Yet, there are few
success stories when it comes to deploying those algorithms to real-world scenarios.
One of the reasons is the lack of guarantees when dealing with and avoiding unsafe
states, a fundamental requirement in critical control engineering systems. In this
paper, we introduce Guided Safe Shooting (GuSS), a model-based RL approach
that can learn to control systems with minimal violations of the safety constraints.
The model is learned on the data collected during the operation of the system in an
iterated batch fashion, and is then used to plan for the best action to perform at each
time step. We propose three different safe planners, one based on a simple random
shooting strategy and two based on MAP-Elites, a more advanced divergent-search
algorithm. Experiments show that these planners help the learning agent avoid
unsafe situations while maximally exploring the state space, a necessary aspect
when learning an accurate model of the system. Furthermore, compared to model-free
approaches, learning a model allows GuSS to reduce the number of interactions
with the real system while still reaching high rewards, a fundamental requirement
when handling engineering systems.
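
The planning step described in the abstract can be illustrated with a minimal sketch of a safe random-shooting planner. This is not the authors' implementation: the function name `safe_random_shooting`, the 1-D toy dynamics, the reward, the unsafe-state test, and the selection rule (fewest predicted violations first, then highest predicted return) are all assumptions chosen for illustration. The model here is hand-written rather than learned from batches of data.

```python
import random

def safe_random_shooting(state, model, reward_fn, is_unsafe,
                         n_candidates=200, horizon=5, rng=None):
    """Return the first action of the best sampled action sequence.

    Hypothetical selection rule: minimise predicted safety violations,
    then maximise predicted return among the safest candidates.
    """
    rng = rng or random.Random(0)
    best_key, best_action = None, None
    for _ in range(n_candidates):
        # Sample one candidate action sequence uniformly in [-1, 1].
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total_reward, violations = state, 0.0, 0
        for a in actions:
            s = model(s, a)                  # roll out inside the model
            total_reward += reward_fn(s, a)
            violations += int(is_unsafe(s))
        key = (violations, -total_reward)    # lexicographic comparison
        if best_key is None or key < best_key:
            best_key, best_action = key, actions[0]
    return best_action

# Toy setup (assumed): the goal lies beyond the safety limit, so a planner
# that only maximised reward would cross into the unsafe region.
model = lambda s, a: s + a               # stand-in for a learned model
reward_fn = lambda s, a: -abs(s - 2.0)   # reward peaks at s = 2.0
is_unsafe = lambda s: s > 1.0            # states above 1.0 are unsafe

rng = random.Random(0)
state = 0.0
trajectory = [state]
actions_taken = []
for _ in range(20):
    # Replan at every time step and execute only the first action.
    action = safe_random_shooting(state, model, reward_fn, is_unsafe, rng=rng)
    state = model(state, action)         # the "real" system equals the model here
    actions_taken.append(action)
    trajectory.append(state)
```

In this toy case the real system and the model coincide, so a candidate with zero predicted violations is also safe when executed. With a learned model, the prediction error would determine how well the predicted violations match the real ones.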
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
Supplementary Material: zip
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/guided-safe-shooting-model-based/code)