Abstract: Current reinforcement learning algorithms struggle to adapt quickly to new situations: they typically require large amounts of experience and extensive optimization over that experience. In this work we combine meta-learning methods from MAML with model-based RL methods based on MuZero to design agents that can quickly adapt online. We propose a new model-based meta-RL algorithm that adapts online to new experience and can be meta-trained without explicit task labels. Compared to prior model-based meta-learning methods, our approach scales to visually complex, image-based environments whose dynamics change significantly over time, and it handles the continual RL setting, which has no episodic boundaries.