Flow-based Domain Randomization for Learning and Sequencing Robotic Skills

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: A normalizing-flow approach to domain randomization yields robust RL policies for multi-step robot manipulation under uncertainty.
Abstract: Domain randomization in reinforcement learning is an established technique for increasing the robustness of control policies learned in simulation. By randomizing properties of the environment during training, the learned policy becomes robust to uncertainty along the randomized dimensions. While the environment distribution is typically specified by hand, in this paper we investigate the problem of automatically discovering this sampling distribution via entropy-regularized reward maximization of a neural sampling distribution in the form of a normalizing flow. We show that this architecture is more flexible and yields better robustness than existing approaches that learn simple parameterized sampling distributions. We demonstrate that this approach can be used to learn robust policies for contact-rich assembly tasks. Additionally, we explore how these sampling distributions, in combination with a privileged value function, can be used for out-of-distribution detection in the context of an uncertainty-aware multi-step manipulation planner.
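To make the entropy-regularized objective above concrete, the sketch below shows one way a normalizing-flow sampling distribution over environment parameters could be updated with a score-function gradient. This is a minimal illustration under stated assumptions, not the released GoFlow implementation (see the linked repository for that): the toy affine flow, the parameter dimensionality, and the `evaluate_policy_return` helper are hypothetical placeholders.

```python
# Minimal sketch (assumption: not the released GoFlow code) of entropy-regularized
# reward maximization for a learned sampling distribution over environment parameters.
# A toy element-wise affine flow stands in for a more expressive normalizing flow;
# `evaluate_policy_return` is a hypothetical placeholder for rolling out the current
# policy in a simulator configured with the sampled parameters.
import torch
import torch.nn as nn


class AffineFlow(nn.Module):
    """Element-wise affine transform of a standard Gaussian base distribution."""

    def __init__(self, dim):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(dim))
        self.log_scale = nn.Parameter(torch.zeros(dim))

    def sample(self, n):
        # Draw environment parameters x = loc + exp(log_scale) * z, with z ~ N(0, I).
        with torch.no_grad():
            z = torch.randn(n, self.loc.shape[0])
            return self.loc + self.log_scale.exp() * z

    def log_prob(self, x):
        # Change of variables: log q(x) = log p(z) - sum(log_scale).
        base = torch.distributions.Normal(torch.zeros_like(self.loc),
                                          torch.ones_like(self.loc))
        z = (x - self.loc) / self.log_scale.exp()
        return base.log_prob(z).sum(-1) - self.log_scale.sum()


def evaluate_policy_return(params):
    # Hypothetical stand-in: run the trained policy in a simulator configured with
    # `params` and return the episode return (here, a dummy reward surface).
    return -(params - 0.5).pow(2).sum(-1)


flow = AffineFlow(dim=3)          # e.g., friction, mass, grasp-pose noise
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
beta = 0.1                        # entropy-regularization weight

for step in range(200):
    x = flow.sample(64)
    returns = evaluate_policy_return(x)
    log_q = flow.log_prob(x)
    # Score-function surrogate for maximizing E_q[return] + beta * H(q),
    # treating -beta * log q(x) as an additional reward term.
    loss = -((returns - beta * log_q.detach()) * log_q).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this reading of the abstract, the entropy term keeps the sampling distribution broad rather than letting it collapse onto a narrow region of easy parameters, and the flow update would be interleaved with policy training.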
Lay Summary: Robots often learn new skills in computer simulations, but what they learn does not always work well in the real world. One way to fix this is by making the training environments more varied, so the robot gets used to different situations. However, these variations are usually chosen by people, which can be time-consuming and not always effective. In this work, we introduce GoFlow, a new method that teaches robots in a smarter way by automatically creating a wide range of helpful training situations. This leads to better learning and makes the robot more prepared for the real world. We tested GoFlow in both virtual and real tasks and found that it helped robots succeed more often. GoFlow can also help the robot know when it needs more information before making a move, which makes it safer and more reliable.
Link To Code: https://github.com/aidan-curtis/goflow
Primary Area: Reinforcement Learning->Planning
Keywords: Reinforcement Learning, Domain Randomization, Uncertainty, Assembly, Planning
Submission Number: 4041