Keywords: assistive robotics, proximal policy optimization, phasic policy gradient, long short term memory
TL;DR: We evaluate recurrent neural networks and phasic policy gradient algorithm for the task of assistive robotics
Abstract: Assistive Robotics is a class of robotics concerned with aiding humans in daily care tasks that they may be inhibited from doing due to disabilities or age. While research has demonstrated that classical control methods can be used to design policies to complete these tasks, these methods can be difficult to generalize to a variety of instantiations of a task. Reinforcement learning can provide a solution to this issue, wherein robots are trained in simulation and their policies are transferred to real-world machines. In this work, we replicate a published baseline for training robots on three tasks in the Assistive Gym environment, and we explore the usage of a Recurrent Neural Network policy and Phasic Policy Gradient learning to augment the original work. Our baseline implementation meets or exceeds the baseline of the original work, however, we found that our explorations into the new methods was not as effective as we anticipated. We discuss the results of our baseline and analyze why our new methods were not as successful.
0 Replies
Loading