Focus on Primary: Differential Diverse Data Augmentation for Generalization in Visual Reinforcement Learning

21 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Visual Reinforcement Learning, Data Augmentation, Generalization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In reinforcement learning, agents commonly overfit the training environment, which makes generalization to unseen environments extremely challenging. Visual reinforcement learning, which relies on observed images as input, is particularly limited in generalization and sample efficiency. To address these challenges, various data augmentation methods are commonly applied to improve generalization and reduce training cost. However, naive use of data augmentation can often cause learning to break down. In this paper, we propose two novel approaches: Diverse Data Augmentation (DDA) and Differential Diverse Data Augmentation (D3A). Leveraging a pre-trained encoder-decoder model, we segment primary pixels so that inappropriate data augmentation does not corrupt critical information. DDA improves the generalization capability of the agent in complex environments through consistency of encoding. D3A applies suitable data augmentation to the primary pixels to further improve generalization while satisfying semantic-invariant state transformation. We extensively evaluate our methods on a series of generalization tasks from the DeepMind Control Suite. The results demonstrate that our methods significantly improve the agent's generalization performance in unseen environments and allow more diverse data augmentations to be selected, improving the sample efficiency of off-policy algorithms.
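The idea of treating primary pixels differently from the rest of the image can be illustrated with a small NumPy sketch. This is not the authors' implementation. The mask below is hand-drawn rather than produced by a pre-trained encoder-decoder, the helper names (`masked_augment`, `random_overlay`, `brightness_jitter`) are hypothetical, and the specific augmentations are placeholders chosen only for illustration. The encoding-consistency component mentioned for DDA is not covered.

```python
import numpy as np


def masked_augment(obs, primary_mask, background_aug, primary_aug=None):
    """Augment primary and non-primary pixels separately, then blend.

    obs:          float array of shape (H, W, C) with values in [0, 1]
    primary_mask: bool array of shape (H, W), True for task-relevant pixels
    Both augmentation arguments are callables mapping an image to an image.
    """
    mask = primary_mask[..., None].astype(obs.dtype)  # (H, W, 1) for broadcasting
    # Primary pixels stay untouched unless a (mild) primary augmentation is given.
    primary = obs if primary_aug is None else primary_aug(obs)
    background = background_aug(obs)
    return mask * primary + (1.0 - mask) * background


def random_overlay(rng, alpha=0.5):
    """Placeholder 'strong' augmentation: blend the image with uniform noise."""
    def aug(obs):
        noise = rng.random(obs.shape)
        return (1.0 - alpha) * obs + alpha * noise
    return aug


def brightness_jitter(rng, max_delta=0.1):
    """Placeholder 'mild' augmentation: shift brightness by a small amount."""
    def aug(obs):
        delta = rng.uniform(-max_delta, max_delta)
        return np.clip(obs + delta, 0.0, 1.0)
    return aug


rng = np.random.default_rng(0)
obs = rng.random((8, 8, 3))           # stand-in for an image observation
mask = np.zeros((8, 8), dtype=bool)   # assumed segmentation of primary pixels
mask[2:6, 2:6] = True

# Primary pixels preserved, everything else strongly augmented.
dda_like = masked_augment(obs, mask, random_overlay(rng))
# Primary pixels additionally receive a mild augmentation of their own.
d3a_like = masked_augment(obs, mask, random_overlay(rng), brightness_jitter(rng))
```

In the first call the masked region is returned unchanged while the background is perturbed. In the second call the masked region is also perturbed, but only by the milder transformation, which is one way to read the distinction the abstract draws between DDA and D3A.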
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3332