MeMo: Meaningful, Modular Controllers Via Information Bottlenecks

22 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: modular neural network policy, policy transfer, imitation learning, reinforcement learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We present a method for pretraining modular controllers that significantly speed up RL training for locomotion and grasping when reused on more complex morphologies.
Abstract: Robots are often built from standardized assemblies (e.g., arms, legs, or fingers), yet each robot must be trained from scratch to control all the actuators of all its parts together. In this paper, we demonstrate a new approach that takes a single robot and its controller as input and produces a set of modular controllers, one for each of these assemblies, such that when a new robot is built from the same parts, its control can be quickly learned by reusing the modular controllers. We achieve this with a framework called MeMo, which learns (Me)aningful, (Mo)dular controllers. Specifically, MeMo pretrains a modular architecture that assigns a separate neural network to each physical substructure and uses an information bottleneck to learn an appropriate division of control information between the modules. We benchmark our framework in locomotion and grasping environments on challenging simple-to-complex robot morphology transfer. We also show that the modules help in task transfer. On both structure and task transfer, MeMo improves training efficiency over pretrained graph neural network baselines. In particular, MeMo significantly improves training efficiency on structure transfer, often achieving 2x the training efficiency of the strongest baseline.
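
To make the abstract's architecture concrete, here is a minimal sketch of one plausible reading: a coordinating network emits a per-module signal, each physical assembly gets its own small network mapping that signal plus local observations to actuator commands, and Gaussian noise injected on the signal acts as an information bottleneck during pretraining so that control information is pushed into the modules. This is our own illustrative construction, not the paper's exact architecture; the class names (BossController, ModuleController) and the noise-injection bottleneck (in the spirit of variational information bottlenecks) are assumptions.

import torch
import torch.nn as nn

class ModuleController(nn.Module):
    # One network per physical assembly (e.g., a leg or finger); hypothetical structure.
    def __init__(self, local_obs_dim, msg_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(local_obs_dim + msg_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim))

    def forward(self, local_obs, message):
        # Map the module's local observation plus the coordination signal to actuator commands.
        return self.net(torch.cat([local_obs, message], dim=-1))

class BossController(nn.Module):
    # Produces one coordination signal per module from the global observation.
    def __init__(self, obs_dim, msg_dim, n_modules, hidden=128):
        super().__init__()
        self.msg_dim, self.n_modules = msg_dim, n_modules
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, msg_dim * n_modules))

    def forward(self, obs, noise_std=0.0):
        msgs = self.net(obs).view(-1, self.n_modules, self.msg_dim)
        if noise_std > 0:
            # Bottleneck (assumed form): additive noise limits how much information
            # can flow through the messages, pushing control into the modules.
            msgs = msgs + noise_std * torch.randn_like(msgs)
        return msgs

Under this reading, pretraining would clone the given controller with noise_std > 0, and transfer to a new morphology would reuse the ModuleController weights for the shared assemblies while training only a fresh BossController with RL.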
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5326