Robot Learning by Collaborative Network Training: A Self-Supervised Method using Ranking

Mason Bretan, Sageev Oore, Siddharth Sanan, Larry P. Heck

Published: 2019, Last Modified: 13 Nov 2024, Venue: AAMAS 2019, License: CC BY-SA 4.0

Abstract: We introduce Collaborative Network Training, a self-supervised method for training neural networks with four aims: 1) supporting task objective functions that are not directly differentiable with respect to the network output; 2) generating continuous-space actions; 3) optimizing more directly for the desired task; 4) learning parameters when a process for measuring performance is available but labeled data is not. The procedure involves three randomly initialized, independent networks that use ranking to train one another on a single task. The method incorporates qualities from ensemble learning and reinforcement learning, as well as from gradient-free optimization methods such as Nelder-Mead. We evaluate the method against various baselines on a variety of robotics-related tasks, including inverse kinematics, controls, and planning, in both simulated and real-world environments.
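The abstract does not spell out the update rule, so the following is only a minimal sketch of one way three networks could train one another through ranking. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the toy task (`forward_process`), the linear "networks", the exploration noise added to each proposal, and the choice to regress every model toward the top-ranked proposal. The task score is only evaluated, never differentiated, and no labeled data is used.

```python
import random


def forward_process(action):
    # Hypothetical black-box task: it can be evaluated, but the training
    # loop below never differentiates through it.
    return 2.0 * action + 0.5


def task_error(action, target):
    # Performance measure used only for ranking proposals (lower is better).
    return abs(forward_process(action) - target)


def mean_error(model, n=21):
    # Average task error of one model over an evenly spaced grid of targets.
    w, b = model
    targets = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return sum(task_error(w * t + b, t) for t in targets) / n


def train(seed=0, steps=3000, lr=0.1, noise=0.2):
    rng = random.Random(seed)
    # Three independent, randomly initialized linear models: action = w * target + b.
    models = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(3)]
    for _ in range(steps):
        target = rng.uniform(-1, 1)
        outputs = [w * target + b for w, b in models]
        # Assumption: each model perturbs its output so the ranking has
        # alternatives to choose between.
        proposals = [out + rng.gauss(0, noise) for out in outputs]
        # Rank proposals by measured task error and keep the best one.
        best = min(proposals, key=lambda p: task_error(p, target))
        # The best proposal acts as a pseudo-label: each model takes one
        # squared-error gradient step toward it.
        for model, out in zip(models, outputs):
            residual = out - best
            model[0] -= lr * residual * target
            model[1] -= lr * residual
    return models


if __name__ == "__main__":
    for i, model in enumerate(train()):
        print(f"model {i}: mean task error {mean_error(model):.3f}")
```

In this toy setting the ranking step plays the role of the missing labels: the gradient is taken only through each model's own squared error to the winning proposal, which is why the task measure itself does not need to be differentiable.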