Reinforcement Learning Control of a Physical Robot Device for Assisted Human Walking without a Simulator

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: This study presents an innovative reinforcement learning (RL) control approach to facilitate soft exosuit-assisted human walking. Our goal is to address the ongoing challenges in developing reliable RL-based methods for controlling physical devices. To overcome key obstacles, such as limited data, the absence of a simulator for human-robot interaction during walking, the need for low computational overhead in real-time deployment, and the demand for rapid adaptation to achieve personalized control while ensuring human safety, we propose an online Adaptation from an offline Imitating Expert Policy (AIP) approach. Our offline learning mimics human expert actions from real human walking demonstrations without robot assistance. The resulting policy is then used to initialize online actor-critic learning, whose goal is to optimally personalize robot assistance. In addition to being fast and robust, our online RL method possesses important properties such as learning convergence, dynamic stability, and solution optimality. We successfully demonstrated our simple and robust framework for safe robot control on all five tested human participants, without selectively presenting results. The qualitative performance guarantees provided by our online RL, together with the consistent experimental validation of AIP control, represent the first demonstration of online adaptation for soft exosuit control personalization and serve as important evidence for the use of online RL in controlling a physical device to solve a real-life problem.
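To make the two-stage structure concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: behavior-cloning an expert policy offline, then using it to initialize online actor-critic updates. All names (`Actor`, `Critic`, `bc_pretrain`, `online_step`), architectures, and hyperparameters are illustrative assumptions, not the authors' implementation; see the linked repository for the actual AIP code.

```python
# Illustrative sketch only -- not the paper's AIP implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):
    """Policy: maps gait-state features to normalized assistance actions."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh())

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q-function used by the online actor-critic stage."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def bc_pretrain(actor, states, actions, epochs=200, lr=1e-3):
    """Offline stage: behavior-clone expert actions from walking demos."""
    opt = torch.optim.Adam(actor.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.mse_loss(actor(states), actions)
        opt.zero_grad(); loss.backward(); opt.step()

def online_step(actor, critic, a_opt, c_opt, s, a, r, s2, gamma=0.99):
    """Online stage: one actor-critic update on a newly observed transition."""
    with torch.no_grad():                      # one-step TD target
        target = r + gamma * critic(s2, actor(s2))
    c_loss = F.mse_loss(critic(s, a), target)
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()
    a_loss = -critic(s, actor(s)).mean()       # improve policy through the critic
    a_opt.zero_grad(); a_loss.backward(); a_opt.step()

# Toy usage with synthetic data (dimensions are placeholders)
S, A = 8, 2
actor, critic = Actor(S, A), Critic(S, A)
bc_pretrain(actor, torch.randn(256, S), torch.tanh(torch.randn(256, A)))
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
s, act = torch.randn(32, S), torch.rand(32, A) * 2 - 1
r, s2 = torch.randn(32, 1), torch.randn(32, S)
online_step(actor, critic, a_opt, c_opt, s, act, r, s2)
```

The key design choice this sketch mirrors is that the actor is never started from scratch on the human user: the cloned expert policy provides a safe, data-efficient initialization, and online updates only need to adapt it to the individual.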
Lay Summary: Soft robotic exosuits have been developed to assist human walking with reduced effort, but several real-world challenges must be addressed before they can be deployed. Unlike rigid exoskeletons, soft exosuits are made of flexible materials that are comfortable to wear but difficult to model, simulate, and control. Additionally, the device must be personalized to individual users to realize its full benefit. For control purposes, a prominent issue is the time delay in soft actuators, which introduces an additional challenge to reliable, real-time control customized to individuals. This study addresses these challenges by introducing a novel method, Adaptation from an Imitating Policy (AIP), which learns to control the exosuit directly on a human user. Instead of relying on extensive simulations or a massive dataset, neither of which is available for this application, AIP first creates an offline policy and then adapts it in real time to a new user. AIP is designed to handle environmental noise, actuator delays, and inherent inter- and intra-person variability during movement. We tested our method with five human users. Results show reduced walking effort while the device adapts safely to different individuals. Our system offers learning stability, user safety, and improved performance without the need for a simulator, bringing soft wearable robots closer to practical use in daily life.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/JennieSi-Lab-RLOC/ICML2025-AIP
Primary Area: Applications->Robotics
Keywords: Soft Robotics, Human-Robot Interaction, Reinforcement Learning without a Simulator
Submission Number: 14708