Keywords: Reinforcement Learning, Domain Randomization, Uncertainty, Assembly, Planning
TL;DR: We learn a neural sampling distribution for maximum-entropy domain randomization and use it for uncertainty-aware multi-step robotic assembly problems.
Abstract: Domain randomization in reinforcement learning is an established technique for increasing the robustness of control policies learned in simulation. By randomizing properties of the environment during training, the learned policy becomes robust to uncertainty along the randomized dimensions. While the environment distribution is typically specified by hand, in this paper we investigate automatically discovering this sampling distribution via entropy-regularized reward maximization of a neural sampling distribution parameterized as a normalizing flow. We show that this architecture is more flexible and yields greater robustness than existing approaches that learn simple parameterized sampling distributions. We demonstrate that the approach produces robust policies for contact-rich assembly tasks. Additionally, we explore how the learned sampling distribution can be used for out-of-distribution detection in the context of an uncertainty-aware multi-step manipulation planner.
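To make the stated objective concrete, below is a minimal self-contained sketch (not the paper's implementation) of entropy-regularized reward maximization of a normalizing-flow sampling distribution over environment parameters. The RealNVP-style coupling layers, the 2-D parameter space, and `policy_return` (a stand-in for rolling out the current control policy in a randomized simulator) are all illustrative assumptions; the gradient estimator is a standard score-function form of E[R(θ)] + αH(q).

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer over a 2-D environment-parameter vector."""
    def __init__(self, flip: bool, hidden: int = 32):
        super().__init__()
        self.flip = flip
        # Maps the conditioning half to a (log-scale, shift) pair for the other half.
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 2))

    def _split(self, x):
        return (x[:, 1:], x[:, :1]) if self.flip else (x[:, :1], x[:, 1:])

    def _join(self, cond, out):
        return torch.cat([out, cond], 1) if self.flip else torch.cat([cond, out], 1)

    def forward(self, z):   # base noise -> environment parameters
        cond, tgt = self._split(z)
        log_s, t = self.net(cond).chunk(2, dim=1)
        return self._join(cond, tgt * torch.exp(log_s) + t), log_s.sum(1)

    def inverse(self, x):   # environment parameters -> base noise (for log_prob)
        cond, tgt = self._split(x)
        log_s, t = self.net(cond).chunk(2, dim=1)
        return self._join(cond, (tgt - t) * torch.exp(-log_s)), -log_s.sum(1)

class Flow(nn.Module):
    def __init__(self, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(AffineCoupling(flip=i % 2 == 1) for i in range(n_layers))
        self.base = torch.distributions.Normal(torch.zeros(2), torch.ones(2))

    def sample(self, n):
        x = self.base.sample((n,))
        for layer in self.layers:
            x, _ = layer(x)
        return x

    def log_prob(self, x):
        log_det = torch.zeros(x.shape[0])
        for layer in reversed(self.layers):
            x, ld = layer.inverse(x)
            log_det = log_det + ld
        return self.base.log_prob(x).sum(1) + log_det

def policy_return(theta):
    # Hypothetical stand-in for rolling out the current control policy in a
    # simulator whose randomized parameters (e.g., friction, mass) are theta.
    return -((theta - 1.0) ** 2).sum(1)

flow, alpha = Flow(), 0.1                        # alpha weights the entropy bonus
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
for step in range(2000):
    with torch.no_grad():
        theta = flow.sample(256)                 # environment parameters to train in
    log_q = flow.log_prob(theta)
    # Score-function gradient of E[R(theta)] + alpha * H(q): both terms reduce
    # to an advantage-weighted log-likelihood of the sampled parameters.
    adv = (policy_return(theta) - alpha * log_q).detach()
    adv = adv - adv.mean()                       # baseline for variance reduction
    loss = -(adv * log_q).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

The same flow's `log_prob` also gives a tractable density score that could be thresholded for the out-of-distribution detection the abstract mentions: parameter estimates with low likelihood under the learned distribution fall outside the regime the policy was trained to handle.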
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2708