Reinforcement Learning for Robot Navigation with Adaptive Execution Duration (AED) in a Semi-Markov Model

Abstract: Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, by directly mapping perception inputs into robot control commands. However, most existing methods ignore the local minimum problem in navigation and therefore cannot handle complex unknown environments. In this paper, we propose Adaptive Forward Simulation Time (AFST), the first DRL-based navigation method modeled as a semi-Markov decision process (SMDP) with a continuous action space, to overcome this problem. Specifically, we improve the distributed proximal policy optimization (DPPO) algorithm for the specified SMDP problem by modifying its generalized advantage estimation (GAE) to better estimate the policy gradient in SMDPs. We evaluate our approach both in simulation and in the real world.
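The abstract does not state the modified estimator, so the following is only a minimal sketch of one common way to make GAE duration-aware in an SMDP, not the authors' formula. It assumes each action k runs for a variable time `durations[k]` and that both the discount and the GAE decay are raised to that duration; the function name `smdp_gae` and all parameter names are hypothetical.

```python
from typing import List, Sequence


def smdp_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    durations: Sequence[float],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Duration-aware GAE sketch (an assumption, not the paper's estimator).

    rewards[k]   : reward accumulated while action k was executing
    values       : V(s_0), ..., V(s_K); the last entry is the bootstrap value
    durations[k] : execution time of action k, in units of the base time step
    """
    n = len(rewards)
    if len(values) != n + 1 or len(durations) != n:
        raise ValueError("need len(values) == len(rewards) + 1 == len(durations) + 1")

    advantages = [0.0] * n
    running = 0.0
    # Work backwards so each advantage reuses the one that follows it.
    for k in reversed(range(n)):
        discount = gamma ** durations[k]   # discount over the whole action
        decay = lam ** durations[k]        # GAE decay over the whole action
        delta = rewards[k] + discount * values[k + 1] - values[k]
        running = delta + discount * decay * running
        advantages[k] = running
    return advantages


if __name__ == "__main__":
    # With unit durations this reduces to standard GAE.
    print(smdp_gae([1.0, 1.0], [0.0, 0.0, 0.0], [1, 1], gamma=0.5, lam=0.5))
    # A longer first action discounts the later advantage more heavily.
    print(smdp_gae([1.0, 1.0], [0.0, 0.0, 0.0], [2, 1], gamma=0.5, lam=0.5))
```

When every duration equals 1, the recursion is the standard GAE recursion, which gives a simple sanity check for any duration-aware variant.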