Keywords: Robot Learning, Reinforcement Learning, Fine-Tuning
TL;DR: We use value functions trained in simulation to guide exploration for efficient real-world fine-tuning, supported by robot hardware experiments and theoretical results.
Abstract: Robot learning requires a considerable amount of data to realize the promise of generalization. However, it can be challenging to collect the volume of high-quality data necessary for generalization entirely in the real world. Simulation can serve as a source of plentiful data, wherein techniques such as reinforcement learning can obtain broad coverage over states and actions. However, even high-fidelity physics simulators are fundamentally misspecified approximations to reality, making direct zero-shot transfer challenging, especially in tasks where precise and forceful manipulation is necessary. This makes real-world fine-tuning of policies pretrained in simulation an attractive approach to robot learning. However, exploring the real-world dynamics with standard RL fine-tuning techniques is too inefficient for many real-world applications. This paper introduces Simulation-Guided Fine-Tuning, a general framework that leverages the structure of the simulator to guide exploration, substantially accelerating adaptation to the real world. We demonstrate our approach across several manipulation tasks in the real world, learning successful policies for problems that are challenging to learn using purely real-world data. We further provide theoretical backing for the paradigm.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12648