Abstract: The rapid development of deep neural networks and reinforcement learning (RL) has enabled a large number of applications that generate motions in robotics, video games and movies. However, current motion controllers are task- and agent-dependent, and are sensitive to small changes in agents' dynamics. Their performance also degrades with high-dimensional agents such as humanoids. Therefore, existing research methods cannot scale to real-life applications. In this thesis, we present a series of research works that improve the scalability of physics-based motion controllers. First, we identify one potential cause of the scalability gap: task-dependent objective modeling and reward design. We propose UniCon, a two-level framework that disentangles high-level behaviors from low-level motion skills. By developing multi-task objective functions and constraints, UniCon enables learning of thousands of motion skills with a single neural controller, and can be used for various interactive applications in a plug-and-play fashion. The second line of research uses structured priors in RL, aiming to improve the scalability of modeling agents with different topologies and structures. We propose NerveNet, which models the agent as a graph and learns a structured policy with graph neural networks. NerveNet greatly increases the scalability and transferability of the learnt features between different agents, demonstrating zero-shot performance across agents with similar structures. Following NerveNet, we further propose neural graph evolution (NGE). NGE improves the graph modeling, which scales and shares the graph controllers across agents with arbitrary structures. NGE empowers the search optimizer to find optimal graph structures and policies jointly. We demonstrate the effectiveness of NGE on the task of robot search, where NGE successfully evolves effective agents much faster and more efficiently.
Our final contribution is GraphCon, which combines the insights from the aforementioned research. We utilize the RL formulation from UniCon, and the graph policy modeling from NerveNet and NGE. We show that GraphCon generates realistic physics-based control that scales across a large variety of motions and characters, and naturally models object interactions with graphs.