Contrastive Unsupervised Learning of World Model with Invariant Causal Features

Published: 01 Feb 2023 · Last Modified: 13 Feb 2023 · Submitted to ICLR 2023 · Readers: Everyone
Keywords: world models, causality, contrastive learning, model-based reinforcement learning, reinforcement learning, out-of-distribution generalisation, sim-to-real transfer, robot navigation
TL;DR: We present a world model that learns causal features using the invariance principle and achieves state-of-the-art performance on out-of-distribution generalisation.
Abstract: In this paper we present a world model that learns causal features using the invariance principle. We use contrastive unsupervised learning to learn the invariant causal features, enforcing invariance across augmentations of the irrelevant parts or styles of the observation. Because world-model-based reinforcement learning methods optimise representation learning and the agent's policy independently, the contrastive loss collapses for lack of a supervisory signal to the representation-learning module. To mitigate this, we propose depth reconstruction as an auxiliary task that explicitly enforces the invariance, together with data augmentation as style intervention on the RGB observation space. This design lets us leverage state-of-the-art unsupervised representation learning to build a world model with invariant causal features, which outperforms current state-of-the-art model-based and model-free reinforcement learning methods on out-of-distribution point-navigation tasks on the Gibson and iGibson datasets at the 100k and 500k interaction-step benchmarks. In further experiments on the DeepMind Control Suite, even without depth reconstruction, our proposed model performs on par with its state-of-the-art counterparts.
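The submission page carries no code, but the abstract describes a concrete mechanism, so the sketch below illustrates one plausible reading of it: an InfoNCE-style contrastive loss over two style-augmented views of the same RGB observation, combined with a depth-reconstruction auxiliary head. Everything here (module names such as WorldModelEncoder and DepthDecoder, the 64x64 input shapes, the beta weight, the toy style_jitter augmentation) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch, NOT the authors' code: contrastive loss over
# style-augmented RGB views + depth-reconstruction auxiliary loss,
# which supplies the supervisory signal that prevents collapse.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WorldModelEncoder(nn.Module):
    """Maps an RGB observation to a latent feature vector."""

    def __init__(self, latent_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.LazyLinear(latent_dim)  # infers conv output size

    def forward(self, obs):
        return self.fc(self.conv(obs))


class DepthDecoder(nn.Module):
    """Auxiliary head: reconstructs depth from the latent, anchoring it
    to scene geometry (the style-invariant content)."""

    def __init__(self, latent_dim=50):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 32 * 16 * 16)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.deconv(self.fc(z).view(-1, 32, 16, 16))


def style_jitter(obs):
    """Toy style intervention: random per-image brightness/contrast on
    RGB only; a stand-in for the paper's augmentations. Depth targets
    are never augmented."""
    m = obs.mean(dim=(2, 3), keepdim=True)
    contrast = 0.8 + 0.4 * torch.rand(obs.size(0), 1, 1, 1)
    brightness = 0.8 + 0.4 * torch.rand(obs.size(0), 1, 1, 1)
    return ((obs - m) * contrast + m) * brightness


def info_nce(z_a, z_b, temperature=0.1):
    """Views of the same frame are positives; all other frames in the
    batch serve as negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)


def loss_step(encoder, decoder, obs, depth, beta=1.0):
    z_a = encoder(style_jitter(obs))  # two independently augmented views
    z_b = encoder(style_jitter(obs))
    contrastive = info_nce(z_a, z_b)
    recon = F.mse_loss(decoder(z_a), depth)  # auxiliary depth loss
    return contrastive + beta * recon


if __name__ == "__main__":
    enc, dec = WorldModelEncoder(), DepthDecoder()
    obs = torch.rand(8, 3, 64, 64)    # batch of RGB observations
    depth = torch.rand(8, 1, 64, 64)  # corresponding depth maps
    loss_step(enc, dec, obs, depth).backward()
```

The point the sketch makes explicit: the two views share geometry and differ only in style, so the contrastive term pushes the encoder to discard style, while the depth loss, computed from un-augmented depth, anchors the latent to scene content and keeps the representation from collapsing to a trivial constant.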
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)