Keywords: meta-learning, world models, reinforcement learning, model-based reinforcement learning
Abstract: We propose Meta-World Conditional Neural Processes (MW-CNP), a conditional world model generator that leverages the sample efficiency and scalability of Conditional Neural Processes to produce a world model from which an agent can sample. Our goal is to reduce the agent's interaction with the target environment as much as possible. To this end, MW-CNP meta-learns world models from prior experience. Using the world model generated by MW-CNP, the RL agent can imagine the unseen environment after conditioning on significantly fewer samples collected from the target environment. We emphasize that the agent does not have access to the task parameters during either training or testing.
TL;DR: Generating world models for fast adaptation to an unseen target environment whose parameters and reward signals are hidden.
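
To make the conditioning idea concrete, the following is a minimal sketch of a Conditional Neural Process forward pass used as a one-step dynamics model. It is not the authors' implementation. It assumes that each context point is a transition, with input (state, action) and target next state, and that rewards are not predicted since they are described as hidden. The helper names (`init_mlp`, `mlp`, `cnp_predict`), the layer sizes, and the dimensions are hypothetical, and the weights are random and untrained, so the outputs only illustrate shapes and data flow.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def cnp_predict(enc, dec, ctx_x, ctx_y, query_x):
    """Return a Gaussian (mean, std) over the next state for each query."""
    # Encode every context transition, then average the encodings.
    # The mean makes the summary independent of the context order.
    r = mlp(enc, np.concatenate([ctx_x, ctx_y], axis=1)).mean(axis=0)
    # Condition every query on the same summary vector.
    tiled = np.tile(r, (query_x.shape[0], 1))
    out = mlp(dec, np.concatenate([tiled, query_x], axis=1))
    mean, raw_std = np.split(out, 2, axis=1)
    std = 0.01 + np.log1p(np.exp(raw_std))  # softplus keeps std positive
    return mean, std

# Hypothetical dimensions, chosen only for illustration.
STATE_DIM, ACTION_DIM, LATENT_DIM = 3, 1, 8
rng = np.random.default_rng(0)
enc = init_mlp(rng, [STATE_DIM + ACTION_DIM + STATE_DIM, 16, LATENT_DIM])
dec = init_mlp(rng, [LATENT_DIM + STATE_DIM + ACTION_DIM, 16, 2 * STATE_DIM])

# Five context transitions standing in for samples from the target environment.
ctx_x = rng.normal(size=(5, STATE_DIM + ACTION_DIM))   # (state, action)
ctx_y = rng.normal(size=(5, STATE_DIM))                # next state
query_x = rng.normal(size=(4, STATE_DIM + ACTION_DIM)) # imagined queries

mean, std = cnp_predict(enc, dec, ctx_x, ctx_y, query_x)
print(mean.shape, std.shape)
```

In this sketch, an agent would sample imagined next states from the predicted Gaussian rather than stepping the target environment. Because the context summary is an average, adding or reordering context transitions requires no change to the network.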