- Keywords: robotics, control, manipulation, reinforcement learning, deep reinforcement learning
- Abstract: Our approach is based on residual policy learning: we hand-design a control policy and learn residual corrections to it. The hand-designed policy consists of a position controller that follows a planned motion and a torque controller that applies corrective forces to the cube. We try a series of grasps until we find one from which a motion to the goal location can be planned. For Task 4, we additionally perform a set of scripted manipulations to align the orientation of the cube with the goal before executing our policy. Due to time constraints, our approach is only slightly changed from Phase 1 of the competition, and we use the residual models trained for Phase 1 without modification on Levels 2, 3, and 4. See the future work section for details about improving our training pipeline for the real system.
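
The two ideas in the abstract, adding a learned correction to a hand-designed action and trying grasps in order until one admits a plan, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `residual_action`, `first_feasible_grasp`, `base_policy`, `residual_policy`, and `plan_to_goal`, the `scale` factor, the optional clipping, and the choice to pass the base action to the residual policy are all assumptions made for illustration.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

import numpy as np


def residual_action(
    base_policy: Callable[[Any], np.ndarray],
    residual_policy: Callable[[Any, np.ndarray], np.ndarray],
    obs: Any,
    scale: float = 1.0,
    low: Optional[np.ndarray] = None,
    high: Optional[np.ndarray] = None,
) -> np.ndarray:
    """Combine a hand-designed action with a learned correction.

    Hypothetical helper: the final command is the base action plus a
    scaled residual, optionally clipped to actuator limits.
    """
    base = np.asarray(base_policy(obs), dtype=float)
    # Assumption: the residual policy sees the observation and the base action.
    correction = np.asarray(residual_policy(obs, base), dtype=float)
    action = base + scale * correction
    if low is not None or high is not None:
        action = np.clip(action, low, high)
    return action


def first_feasible_grasp(
    grasps: Iterable[Any],
    plan_to_goal: Callable[[Any], Optional[Any]],
) -> Optional[Tuple[Any, Any]]:
    """Return the first grasp for which the planner finds a path.

    Hypothetical helper: `plan_to_goal` is assumed to return None when
    no motion plan to the goal exists for the given grasp.
    """
    for grasp in grasps:
        plan = plan_to_goal(grasp)
        if plan is not None:
            return grasp, plan
    return None
```

Under these assumptions, setting `scale` to zero recovers the hand-designed controller exactly, which is one common reason for structuring a learned policy as a residual.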