On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement LearningDownload PDF

Anonymous

22 Sept 2022, 12:35 (modified: 17 Nov 2022, 06:32)ICLR 2023 Conference Blind SubmissionReaders: Everyone
Keywords: model-based reinforcement learning, visual reinforcement learning
TL;DR: We investigate the feasibility of pretraining and cross-task transfer in model-based RL, and improve sample-efficiency substantially over baselines on the Atari100k benchmark.
Abstract: Reinforcement Learning (RL) algorithms can solve challenging control problems directly from image observations, but they often require millions of environment interactions to do so. Recently, model-based RL algorithms have greatly improved sample-efficiency by concurrently learning an internal model of the world, and supplementing real environment interactions with imagined rollouts for policy improvement. However, learning an effective model of the world from scratch is challenging, and in stark contrast to humans that rely heavily on world understanding and visual cues for learning new skills. In this work, we investigate whether internal models learned by modern model-based RL algorithms can be leveraged to solve new, distinctly different tasks faster. We propose Model-Based Cross-Task Transfer (XTRA), a framework for sample-efficient online RL with scalable pretraining and finetuning of learned world models. By proper pretraining and concurrent cross-task online fine-tuning, we achieve substantial improvements over a baseline trained from scratch; we improve mean performance of model-based algorithm EfficientZero by $23\%$, and by as much as $71\%$ in some instances.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
13 Replies

Loading