The Body is not a Given: Joint Agent Policy Learning and Morphology Evolution

Dylan Banarse; Yoram Bachrach; Siqi Liu; Chrisantha Fernando; Nicolas Heess; Pushmeet Kohli; Guy Lever; Thore Graepel

27 Sept 2018 (modified: 05 May 2023), ICLR 2019 Conference Withdrawn Submission, Readers: Everyone

Abstract: Reinforcement learning (RL) has proven to be a powerful paradigm for deriving complex behaviors from simple reward signals in a wide range of environments. When applying RL to continuous control agents in simulated physics environments, the body is usually considered to be part of the environment. However, during evolution the physical bodies of biological organisms and their controlling brains are co-evolved, thus exploring a much larger space of actuator/controller configurations. Put differently, intelligence does not reside only in the agent's mind, but also in the design of its body. We propose a method for uncovering strong agents, each consisting of a good combination of body and policy, based on combining RL with an evolutionary procedure. Given the resulting agent, we also propose an approach for identifying the body changes that contributed the most to the agent's performance. We use the Shapley value from cooperative game theory to find the fair contribution of individual components, taking into account synergies between components. We evaluate our methods in an environment similar to the recently proposed Robo-Sumo task, where agents in a 3D environment with simulated physics compete to tip over their opponent or push it out of the arena. Our results show that the proposed methods are indeed capable of generating strong agents, significantly outperforming baselines that focus on optimizing the agent policy alone. A video is available at: www.youtube.com/watch?v=eei6Rgom3YY
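The Shapley attribution mentioned in the abstract can be illustrated with a minimal sketch. It computes exact Shapley values by averaging each component's marginal contribution over all orderings. The component names (`leg_length`, `torso_mass`, `foot_size`) and the scoring function `toy_value` are hypothetical and chosen only for illustration; they are not taken from the paper, which would instead evaluate agents in simulation.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal gain from adding p to the components already present.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical performance of an agent with the given body changes applied."""
    score = 0.0
    if "leg_length" in coalition:
        score += 2.0
    if "torso_mass" in coalition:
        score += 1.0
    # Assumed synergy: the two changes together are worth more than their sum.
    if {"leg_length", "torso_mass"} <= coalition:
        score += 3.0
    return score


components = ["leg_length", "torso_mass", "foot_size"]
phi = shapley_values(components, toy_value)
print(phi)
```

In this toy game the synergy bonus is split equally between the two components that create it, the component with no effect receives zero, and the values sum to the score of the full set of changes. Exact enumeration scales factorially with the number of components, so larger bodies would need a sampling-based approximation.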

Keywords: Reinforcement Learning, Continuous Control, Evolutionary Computation, Genetic Algorithms, Evolving Morphology, Baldwin Effect, Population Based Training

TL;DR: Evolving the shape of the body in RL-controlled agents improves their performance (and helps learning)
