Increasing Data Efficiency of Driving Agent By World Model

Dec 14, 2020 (edited Dec 26, 2020) · CUHK 2021 Course IERG5350 Blind Submission · Readers: Everyone
  • Keywords: reinforcement learning
  • Abstract: Reinforcement learning algorithms for real-world autonomous driving must be able to handle complex, unknown dynamical systems. This requirement is handled well by model-free algorithms such as PPO. However, model-free approaches tend to be substantially less sample-efficient than model-based ones. In this work, we aim to retain the advantages of the model-free method while increasing the stability and data efficiency of PPO. To this end, we propose a world model that represents popular reinforcement learning environments through compressed spatio-temporal representations, allowing the model-free method to learn behaviors from imagined outcomes and thereby improving sample efficiency. The experimental results indicate that our approach mitigates the inefficiency of PPO, increases its stability, and substantially reduces training time. The code and a video are available online.