Keywords: Deep Reinforcement Learning, Imitation Learning, Multi-Embodiment, Design Optimization
TL;DR: Self-imitation learning helps increase sample efficiency when optimizing agent design and behaviour, even when embodiments differ.
Abstract: The co-optimization of the body and behaviour of agents is a longstanding problem in evolutionary robotics and embodied AI. Previous work has largely focused on learning methods that exploit massively parallel agent evaluations with large population sizes, a paradigm that is applicable to simulated agents but cannot be transferred to the real world because of the costs associated with producing embodiments and robots. Furthermore, recent data-efficient approaches based on reinforcement learning can suffer from distributional shifts in transition dynamics, as well as in state and action spaces, when they encounter new body morphologies. In this work, we propose a new co-adaptation method that combines reinforcement learning and State-Aligned Self-Imitation Learning to co-optimize embodiment and behavioural policies within a handful of design iterations. We show that integrating a self-imitation signal improves the data efficiency of the co-adaptation process as well as behavioural recovery when adapting morphological parameters.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2709