Filtered Probabilistic Model Predictive Control-based Reinforcement Learning for Unmanned Surface Vehicles

Abstract: In this article, we address the difficulty of controlling unmanned surface vehicles (USVs) under unforeseeable and unobservable external disturbances using model-based reinforcement learning (MBRL) without human prior knowledge. A novel MBRL approach, filtered probabilistic model predictive control (FPMPC), is proposed to iteratively learn the USV model and an MPC-based policy in a probabilistic way through trial-and-error interactions. Compared with existing MBRL approaches that model the unobservable disturbances as system noise, FPMPC introduces a Bayesian filtering process that implicitly translates the system dynamics into a partially observable Markov decision process (POMDP), representing those disturbances as hidden states. An adaptive sample selection mechanism is proposed to remove redundant learning samples based on the filter belief. Equipped with bias compensation and parallel computation, an FPMPC system specific to USVs is developed. Evaluated on both position-holding and target-reaching tasks in a simulation driven by real USV data, FPMPC demonstrates significant superiority in control performance, generalization capability, and sample efficiency under large disturbances compared with the baseline approaches.
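To make the abstract's main ingredients concrete, the following minimal sketch illustrates the overall idea rather than the authors' implementation: a Kalman-style belief over a hidden disturbance (the POMDP hidden state), a random-shooting MPC rollout on an assumed dynamics model compensated by the belief mean, and a belief-based sample-selection rule. All names (usv_dynamics, filter_update, plan_mpc, keep_sample) and the toy dynamics are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch of an FPMPC-style loop: filter a hidden disturbance,
# plan with MPC under the current belief, and keep samples adaptively.
import numpy as np

rng = np.random.default_rng(0)

def usv_dynamics(x, u, d):
    """Toy 2-D USV surrogate: position-velocity integrator with hidden drift d."""
    pos, vel = x[:2], x[2:]
    vel = 0.9 * vel + 0.1 * u + d            # hidden disturbance enters the velocity
    return np.concatenate([pos + 0.1 * vel, vel])

def filter_update(mu, var, x_prev, u_prev, x_now, q=1e-3, r=1e-2):
    """Kalman-style belief update over the hidden disturbance (POMDP hidden state)."""
    var = var + q                             # predict: random-walk disturbance
    resid = (x_now[2:] - (0.9 * x_prev[2:] + 0.1 * u_prev)) - mu
    k = var / (var + r)                       # per-dimension gain
    return mu + k * resid, (1.0 - k) * var    # corrected mean and variance

def plan_mpc(x, mu_d, horizon=10, n_samples=200):
    """Random-shooting MPC on the assumed model, compensated by the belief mean."""
    best_u, best_cost = None, np.inf
    for _ in range(n_samples):
        u_seq = rng.uniform(-1.0, 1.0, size=(horizon, 2))
        xs, cost = x.copy(), 0.0
        for u in u_seq:
            xs = usv_dynamics(xs, u, mu_d)    # roll out with disturbance compensation
            cost += np.sum(xs[:2] ** 2) + 0.01 * np.sum(u ** 2)
        if cost < best_cost:
            best_cost, best_u = cost, u_seq[0]
    return best_u

def keep_sample(var_d, threshold=5e-3):
    """Adaptive sample selection: store data only while the belief is still uncertain."""
    return np.mean(var_d) > threshold

x, mu_d, var_d = np.zeros(4), np.zeros(2), np.ones(2)
true_d, dataset = np.array([0.05, -0.03]), []
for t in range(100):
    u = plan_mpc(x, mu_d)
    x_next = usv_dynamics(x, u, true_d) + 0.01 * rng.standard_normal(4)
    mu_d, var_d = filter_update(mu_d, var_d, x, u, x_next)
    if keep_sample(var_d):
        dataset.append((x, u, x_next))        # samples that would refit the dynamics model
    x = x_next
print(f"estimated disturbance: {mu_d}, kept {len(dataset)} samples")
```

In the paper's full method the dynamics model is learned probabilistically from the collected samples rather than assumed, and bias compensation and parallel computation are added on top of this loop; the sketch only mirrors the filter-then-plan structure.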