HuRi : Humanoid Robots Adaptive Risk-ware Distributional Reinforcement Learning for Robust Control

Junlong Wu; Yi Cheng; Hang Liu; Houde Liu; Xueqian Wang; Bin Liang

28 Sept 2024 (modified: 05 Feb 2025); Submitted to ICLR 2025; CC BY 4.0

Keywords: Adaptive Risk-Aware, Distributional Reinforcement Learning, Humanoid Robots, Locomotion Control

TL;DR: HuRi measures the uncertainty of the environment from its inputs and the computed probability distributions, and combines IQR and RND with Dist. RL to adjust the agent's risk sensitivity level.

Abstract: Due to the high complexity of bipedal locomotion, controlling the locomotion of humanoid robots requires precise adjustment of the balance system to adapt to varying environmental conditions. Few prior studies have explicitly incorporated risk factors into robot policy training, and existing approaches lack the ability to adaptively adjust risk sensitivity across environments with different levels of risk. This deficiency impairs the agent’s exploration during training, so the agent fails to select the optimal action in risky environments. We propose an adaptive risk-aware policy (HuRi) based on distributional reinforcement learning (Dist. RL). In Dist. RL, the policy controls its risk sensitivity by applying different distortion measures to the estimated return distribution. HuRi dynamically selects the risk sensitivity level under varying environmental conditions, using the interquartile range (IQR) to measure intrinsic uncertainty and Random Network Distillation (RND) to assess the parameter uncertainty of the environment. The algorithm allows the agent to explore safely and efficiently in hazardous environments during training, enhancing the mobility of humanoid robots. Simulations and real-world deployments on the Zerith-1 robot have been conducted to confirm the robustness of HuRi.
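
The following is a minimal sketch, not the authors' implementation, of the two ideas the abstract names: measuring the spread of an estimated return distribution with the interquartile range, and changing risk sensitivity by applying a distortion measure to that distribution. It assumes the return distribution is represented by a vector of quantile values, uses CVaR only as one common example of a distortion risk measure (the abstract does not say which measures HuRi uses), and treats the RND novelty score as a given number. The function `risk_level_from_uncertainty` and its `scale` parameter are hypothetical and purely illustrative.

```python
import numpy as np


def interquartile_range(quantile_values):
    """Spread of the return distribution: 75th minus 25th percentile."""
    q75, q25 = np.percentile(quantile_values, [75, 25])
    return float(q75 - q25)


def cvar(quantile_values, alpha):
    """CVaR distortion: mean of the lowest alpha-fraction of quantile values.

    alpha = 1.0 gives the risk-neutral mean; smaller alpha is more risk-averse.
    """
    sorted_vals = np.sort(np.asarray(quantile_values, dtype=float))
    k = max(1, int(np.ceil(alpha * sorted_vals.size)))
    return float(sorted_vals[:k].mean())


def risk_level_from_uncertainty(iqr, novelty, scale=1.0):
    """Hypothetical mapping (an assumption, not from the paper):
    higher total uncertainty gives a smaller alpha, i.e. more risk aversion."""
    total = iqr + novelty
    return float(np.clip(np.exp(-total / scale), 0.05, 1.0))


# Example: ten quantile estimates of the return for one state-action pair.
quantiles = np.arange(10.0)
iqr = interquartile_range(quantiles)
alpha = risk_level_from_uncertainty(iqr, novelty=0.0, scale=5.0)
risk_adjusted_value = cvar(quantiles, alpha)
```

In this sketch a wide return distribution or a high novelty score pushes `alpha` down, so the value estimate focuses on the worst outcomes; a narrow, familiar distribution leaves `alpha` near 1 and the estimate near the plain mean.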

Supplementary Material: zip

Primary Area: applications to robotics, autonomy, planning

Submission Number: 14098
