RF-DROPO: Data-Efficient Adaptive Domain Randomization for Zero-Shot Sim-to-Real Transfer in Soft Robotics
Keywords: Soft Robotics, Reinforcement Learning, Transfer Learning, Sim-to-Real Transfer, Domain Randomization
TL;DR: We infer a simulator-parameter distribution from a few real trajectories without requiring full-state resets, then use it for domain-randomized policy learning and zero-shot transfer on a real soft robot.
Abstract: Simulation is a practical tool for training control policies for soft robots, but transferring these policies to real systems remains difficult because soft-body dynamics are hard to model accurately and only partially observable in practice. We present RF-DROPO, an offline adaptive domain randomization method that uses a small amount of real-world trajectory data to infer a distribution over simulator parameters, which is then used to train reinforcement learning policies in simulation. Unlike prior approaches that rely on full-state observability or simulator resets to arbitrary intermediate states, RF-DROPO performs reset-free trajectory matching from a shared initial condition and progressively increases the rollout horizon during inference. This makes the method well suited to deformable robotic systems, where accurate state reconstruction is often unavailable and data collection is expensive. We evaluate the approach on soft-robot control tasks including reaching and pushing, and show that the inferred parameter distributions support more reliable transfer than static domain randomization and existing adaptive baselines. We also report zero-shot deployment on a physical soft robot. Overall, the results suggest that lightweight simulator adaptation can substantially improve the practicality of sim-to-real policy learning for deformable robots.
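The reset-free matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D toy simulator, the Gaussian likelihood over simulated rollouts, the grid search, and every name (`simulate`, `log_likelihood`, `infer_distribution`, `DT`, the horizon schedule) are assumptions made only for illustration. Every rollout starts from the shared initial state `x0`, with no resets to intermediate states, and the matched horizon grows stage by stage.

```python
import numpy as np

DT = 0.1  # integration step of the toy simulator (assumption)

def simulate(stiffness, x0, actions):
    """Toy 1-D stand-in for a soft-body simulator: x' = x + DT * (a - stiffness * x)."""
    xs = [x0]
    for a in actions:
        xs.append(xs[-1] + DT * (a - stiffness * xs[-1]))
    return np.array(xs[1:])  # one observation after each action

def log_likelihood(mean, std, z, x0, actions, real_obs, horizon, eps=1e-4):
    """Gaussian log-likelihood of the first `horizon` real observations under
    rollouts with parameters sampled from N(mean, std^2), all started at x0."""
    params = mean + std * z  # common random numbers keep candidates comparable
    sims = np.stack([simulate(p, x0, actions[:horizon]) for p in params])
    mu, var = sims.mean(axis=0), sims.var(axis=0) + eps
    err = real_obs[:horizon] - mu
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + err ** 2 / var)))

def infer_distribution(x0, actions, real_obs, horizons=(5, 10, 20, 40),
                       init_mean=1.0, n_samples=64, seed=0):
    """Local grid search (a simple stand-in optimizer), re-centered on the
    previous estimate each time the rollout horizon is increased."""
    z = np.random.default_rng(seed).standard_normal(n_samples)
    mean, std = init_mean, 0.4
    for h in horizons:
        means = np.clip(mean + np.linspace(-1.0, 1.0, 21), 0.1, None)
        candidates = [(m, s) for m in means for s in (0.05, 0.1, 0.2, 0.4)]
        mean, std = max(
            candidates,
            key=lambda c: log_likelihood(c[0], c[1], z, x0, actions, real_obs, h),
        )
    return float(mean), float(std)

# Synthetic "real" trajectory: the toy system with stiffness 2.0 plus sensor noise.
actions = np.sin(0.3 * np.arange(40))
real_obs = simulate(2.0, 1.0, actions) + np.random.default_rng(1).normal(0, 0.002, 40)

mean, std = infer_distribution(1.0, actions, real_obs)
print(f"inferred stiffness distribution: mean={mean:.2f}, std={std:.2f}")
```

In a full pipeline, the inferred `(mean, std)` would define the randomization distribution sampled at the start of each training episode in simulation. Starting with short horizons keeps early matching errors from compounding before the estimate is roughly right.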
Submission Number: 22