Hierarchically Connecting Modularly-Learned Policies to Generate a Controller for a Combined Robot System

Published: 2025 · Last Modified: 25 Jan 2026 · CASE 2025 · CC BY-SA 4.0
Abstract: Deep reinforcement learning offers a promising approach for controlling robots with high degrees of freedom. However, its application is limited by substantial data requirements and difficulties in simulating complex physical interactions. This paper proposes a novel component-wise hierarchical policy learning approach that addresses these challenges by decomposing the robot into individual components and learning a separate policy for each. This strategy improves data efficiency and allows for more focused training of individual sub-systems. Upper-level policies then integrate these component policies through a separate learning process, enabling coordinated whole-body control of the robot. This hierarchical structure can be interpreted as a curriculum learning strategy, in which the robot gradually learns more complex tasks by first mastering individual component skills. By learning modular policies and then combining them, the approach offers improved generalization and robustness compared to monolithic policy learning. We validate the approach on a modular robot, demonstrating that this hierarchical, component-wise policy learning framework enables efficient control of complex robots.
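The abstract's core idea, learning a frozen policy per component and then training an upper-level policy that coordinates them, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear policies stand in for trained deep RL networks, and the softmax gating used by the upper level is one assumed way the integration could work (all class and parameter names here are hypothetical).

```python
import numpy as np

class ComponentPolicy:
    """Low-level policy for one robot component (e.g. an arm or a mobile base).
    A fixed linear map with tanh squashing stands in for a trained network."""
    def __init__(self, obs_dim, act_dim, seed):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((act_dim, obs_dim)) * 0.1

    def act(self, obs):
        # Bounded action in [-1, 1] per dimension.
        return np.tanh(self.W @ obs)

class HierarchicalController:
    """Upper-level policy that coordinates frozen component policies.
    Here it emits softmax gating weights over the components each step;
    only this gating map would be trained in the second learning phase."""
    def __init__(self, components, global_obs_dim, seed=0):
        self.components = components
        rng = np.random.default_rng(seed)
        self.Wg = rng.standard_normal((len(components), global_obs_dim)) * 0.1

    def act(self, global_obs, component_obs):
        logits = self.Wg @ global_obs
        gate = np.exp(logits - logits.max())
        gate /= gate.sum()  # one weight per component, summing to 1
        # Each component acts on its own observation slice; the gate
        # scales each component's contribution to the whole-body command.
        return [g * p.act(o)
                for g, p, o in zip(gate, self.components, component_obs)]

arm = ComponentPolicy(obs_dim=4, act_dim=2, seed=1)
base = ComponentPolicy(obs_dim=3, act_dim=2, seed=2)
controller = HierarchicalController([arm, base], global_obs_dim=5)
actions = controller.act(np.ones(5), [np.ones(4), np.ones(3)])
```

Keeping the component policies frozen while training only the gating map mirrors the curriculum interpretation above: the hard whole-body problem is reduced to learning a low-dimensional coordination signal.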