Co-learning Planning and Control Policies Constrained by Differentiable Logic Specifications
Abstract: Synthesizing planning and control policies in
robotics is a fundamental task, further complicated by factors
such as complex logic specifications and high-dimensional robot
dynamics. This paper presents a novel reinforcement learning
approach to solving high-dimensional robot navigation tasks
with complex logic specifications by co-learning planning and
control policies. Notably, this approach significantly reduces the
sample complexity of training, allowing us to train high-quality
policies with far fewer samples than existing reinforcement learning algorithms. In addition, our methodology
streamlines complex specification extraction from map images
and enables the efficient generation of long-horizon robot motion paths across different map layouts. Moreover, our approach
handles high-dimensional control and
avoids suboptimal policies via policy alignment. The efficacy
of our approach is demonstrated through experiments involving
simulated high-dimensional quadruped robot dynamics and a
real-world differential drive robot (TurtleBot3) under different
types of task specifications.