Augmenting Offline Reinforcement Learning with State-only Interactions

26 Sept 2024 (modified: 16 Dec 2024)ICLR 2025 Conference Withdrawn SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Offline Reinforcement Learning, Generative Model, Data Augmentation, Trajectory Stitching
TL;DR: We proposed a novel data augmentation method DITS for offline RL, where state-only interactions are available with the environment.
Abstract: Batch offline data have been shown considerably beneficial for reinforcement learning. Their benefit is further amplified by upsampling with generative models. In this paper, we consider a novel opportunity where interaction with environment is feasible, but only restricted to observations, i.e. *no reward* feedback is available. This setting is realistic, because simulators or even real cyber-physical systems are often accessible, while in contrast reward is often difficult or expensive to obtain, similar to imitation learning settings. As a result, the learner must make best sense of the offline data to synthesize the most sample-efficient scheme of querying the transition of observation. Our method first leverages online interactions to generate high-return trajectories via conditional diffusion models. They are then blended with the original offline trajectories through a stitching algorithm, and the resulting augmented data is applied to downstream reinforcement learner. Superior empirical performance is demonstrated over state-of-the-art data augmentation methods that are extended to utilize observation-only interactions.
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8135
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview