Body Transformer: Leveraging Robot Embodiment for Policy Learning

Published: 05 Sept 2024, Last Modified: 18 Oct 2024, CoRL 2024, CC BY 4.0
Keywords: Robot Learning, Graph Neural Networks, Imitation Learning, Reinforcement Learning
Abstract: In recent years, the transformer architecture has become the de facto standard for machine learning algorithms applied to natural language processing and computer vision. Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. We propose Body Transformer (BoT), an architecture that exploits the robot embodiment by providing an inductive bias that guides the learning process. We represent the robot body as a graph of sensors and actuators, and rely on masked attention to pool information through the architecture. The resulting architecture outperforms the vanilla transformer, as well as the classical multilayer perceptron, with respect to task completion, scaling properties, and computational efficiency when representing either imitation or reinforcement learning policies.
Supplementary Material: zip
Spotlight Video: mp4
Video: https://sferrazza.cc/bot_site/static/videos/summary.mp4
Website: https://sferrazza.cc/bot_site/
Code: https://github.com/carlosferrazza/BodyTransformer
Publication Agreement: pdf
Student Paper: no
Submission Number: 728
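
The abstract describes representing the robot body as a graph of sensors and actuators and using masked attention to pool information. The following is a minimal sketch of that general idea, not the authors' implementation (see the Code link above for that). It assumes the attention mask is derived from the graph's adjacency with self-connections added, and it uses a single attention head with no learned projections. The helper names `adjacency_mask` and `masked_attention` and the three-node chain are hypothetical and chosen only for illustration.

```python
import math


def adjacency_mask(num_nodes, edges):
    """Build a boolean mask: node i may attend to node j if i == j
    or (i, j) is an edge of the body graph (assumed undirected)."""
    mask = [[i == j for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        mask[i][j] = True
        mask[j][i] = True
    return mask


def masked_attention(queries, keys, values, mask):
    """Single-head scaled dot-product attention restricted by `mask`.

    Disallowed pairs get a score of -inf, so their softmax weight is 0.
    Each row of `mask` must allow at least one entry (self-connections
    in `adjacency_mask` guarantee this).
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            if mask[i][j] else -math.inf
            for j, k in enumerate(keys)
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


# Hypothetical 3-node chain (e.g. hip - knee - ankle): node 0 and node 2
# are not directly connected, so they cannot attend to each other.
mask = adjacency_mask(3, [(0, 1), (1, 2)])
tokens = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
one_hot_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
attended = masked_attention(tokens, tokens, one_hot_values, mask)
```

With one-hot values, each output row equals that node's attention weights, so the zero entries show which nodes were masked out. In this sketch, stacking several such layers would let information from distant body parts propagate one graph hop per layer.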