Multimodal Policy Search using Overlapping Mixtures of Sparse Gaussian Process Prior

Hikaru Sasaki, Takamitsu Matsubara

2019 (modified: 19 Jul 2022), ICRA 2019

Abstract: In this paper, we present a novel Gaussian-process-based policy search reinforcement learning algorithm that can deal with multimodality in control policies. Our approach employs Overlapping Mixtures of Gaussian Processes (OMGPs) as the control policy, in which all the GPs in the mixture are global and overlap in the input space. We first extend OMGPs by combining them with sparse pseudo-input GPs, yielding Overlapping Mixtures of Sparse Gaussian Processes (OMSGPs), which reduces the computational cost of learning and prediction to a level suitable for policy search. Then, we derive a novel multimodal policy search algorithm based on variational Bayesian inference by placing the OMSGPs as the prior of the multimodal control policy. To validate the effectiveness of our algorithm, we applied it to two typical robotic tasks in simulation: 1) object grasping and 2) table sweeping, both of which require multimodal optimal policies. Simulation results demonstrate that our algorithm can efficiently learn multimodal policies even with high-dimensional observations.
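
To make the idea of "global and overlapping" GPs concrete, here is a minimal toy sketch in Python. It is not the algorithm from the paper. It assumes hard mode assignments instead of variational Bayesian inference, full GPs instead of sparse pseudo-input GPs, and a hypothetical 1-D dataset in which every input region contains samples from two output branches. All function names, kernel settings, and data are illustrative assumptions. The sketch shows that a single GP averages the two branches into an output near zero that matches neither, while two GPs that both span the whole input space can each track one branch.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.3, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior_mean(x_train, y_train, x_test, noise_var=0.05):
    """Posterior mean of a zero-mean GP regressor evaluated at x_test."""
    gram = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    return rbf_kernel(x_test, x_train) @ np.linalg.solve(gram, y_train)


def fit_two_overlapping_gps(x, y, max_iters=20):
    """Alternate between fitting two global GPs and reassigning samples.

    Assumption: hard assignments stand in for the variational posterior
    over mode indicators used in OMGP-style models.
    """
    # Initialise by splitting the samples around a single-GP fit.
    labels = (y > gp_posterior_mean(x, y, x)).astype(int)
    for _ in range(max_iters):
        # Each GP is trained on its own samples but predicts everywhere.
        preds = np.stack([
            gp_posterior_mean(x[labels == m], y[labels == m], x)
            for m in range(2)
        ])
        # Reassign every sample to the GP that explains it best.
        new_labels = np.argmin((preds - y) ** 2, axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, preds


# Hypothetical toy data: two branches, +f(x) and -f(x), interleaved over
# the same input range, so the modes overlap everywhere in input space.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 80)
true_mode = np.arange(80) % 2  # 1 = upper branch, 0 = lower branch
branch = 1.5 + 0.5 * np.sin(2.0 * np.pi * x)
y = np.where(true_mode == 1, branch, -branch) + 0.05 * rng.standard_normal(80)

single_mean = gp_posterior_mean(x, y, x)            # unimodal baseline
labels, mode_preds = fit_two_overlapping_gps(x, y)  # two overlapping GPs
```

In this toy setting the two branches are well separated, so the simple initialisation already recovers the assignment. The variational treatment described in the abstract is what would handle uncertain assignments, and the sparse pseudo-input approximation is what would keep the cost manageable as the number of samples grows.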
