Offline Model-Based Skill Stitching

27 Sept 2024 (modified: 05 Dec 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0
Keywords: Skill stitching, Offline reinforcement learning, Model-based planning
Abstract: We study building agents capable of solving long-horizon tasks using offline model-based reinforcement learning (RL). Existing RL methods effectively learn individual skills. However, seamlessly combining these skills to tackle long-horizon tasks presents a significant challenge, as the termination state of one skill may be unsuitable for initiating the next skill, leading to cumulative distribution shifts. Previous works have studied skill stitching through online RL, which is time-consuming and raises safety concerns when learning in the real world. In this work, we propose a fully offline approach to learn skill stitching. Given that the aggregated datasets from all skills provide diverse and exploratory data, which likely includes the necessary transitions for stitching skills, we train a dynamics model designed to generalize across skills to facilitate this process. Our method employs model predictive control (MPC) to stitch adjacent skills, using an ensemble of offline dynamics models and value functions. To mitigate overestimation issues inherent in models learned offline, we introduce a conservative approach that penalizes the uncertainty in model and value predictions. Our experimental results across various benchmarks validate the effectiveness of our approach in comparison to baseline methods under offline settings.
Primary Area: reinforcement learning
Submission Number: 10675
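
The planning idea in the abstract (MPC over an ensemble of dynamics models and value functions, with a penalty on their uncertainty) can be illustrated with a minimal sketch. This is not the authors' implementation: the random-shooting planner, the toy linear dynamics, the distance-based value functions, the use of ensemble standard deviation as the uncertainty measure, and the weights `lam_model` and `lam_value` are all assumptions made for illustration.

```python
import numpy as np


def make_toy_dynamics(n_models, dim, rng):
    """Hypothetical stand-in for a learned dynamics ensemble.

    Each member predicts s' = s + a @ B_i, with a slightly different B_i.
    """
    return [np.eye(dim) + 0.05 * rng.standard_normal((dim, dim))
            for _ in range(n_models)]


def make_toy_values(n_values, goal, rng):
    """Hypothetical stand-in for a value ensemble.

    Each member scores a state by negative distance to a perturbed goal.
    """
    return [goal + 0.05 * rng.standard_normal(goal.shape)
            for _ in range(n_values)]


def conservative_mpc_action(state, dynamics, value_goals, horizon=5,
                            n_candidates=256, lam_model=1.0, lam_value=1.0,
                            rng=None):
    """Random-shooting MPC with uncertainty penalties (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    dim = state.shape[0]
    # Candidate action sequences, shape (N, H, dim).
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, dim))
    states = np.repeat(state[None, :], n_candidates, axis=0)
    model_penalty = np.zeros(n_candidates)

    for t in range(horizon):
        # Predictions of every ensemble member, shape (M, N, dim).
        preds = np.stack([states + actions[:, t] @ B for B in dynamics])
        # Ensemble disagreement is used as the model-uncertainty proxy.
        model_penalty += preds.std(axis=0).max(axis=-1)
        states = preds.mean(axis=0)

    # Terminal values from every value member, shape (V, N).
    values = np.stack([-np.linalg.norm(states - g, axis=-1)
                       for g in value_goals])
    scores = (values.mean(axis=0)
              - lam_value * values.std(axis=0)
              - lam_model * model_penalty)
    best = int(np.argmax(scores))
    # MPC executes only the first action of the best sequence, then replans.
    return actions[best, 0], float(scores[best])


setup_rng = np.random.default_rng(0)
toy_dynamics = make_toy_dynamics(n_models=5, dim=2, rng=setup_rng)
toy_values = make_toy_values(n_values=5, goal=np.ones(2), rng=setup_rng)
first_action, best_score = conservative_mpc_action(
    np.zeros(2), toy_dynamics, toy_values, rng=np.random.default_rng(1))
```

In this sketch, raising `lam_model` or `lam_value` steers the planner toward action sequences on which the ensemble members agree, which is one common way to express conservatism when no online interaction is available.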