Leveraging exploration in off-policy algorithms via normalizing flows

Bogdan Mazoure, Thang Doan, Audrey Durand, Joelle Pineau, R. Devon Hjelm

2019 (modified: 09 Sept 2021), CoRL 2019, Readers: Everyone

Abstract: The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios. Approaches such as neural dens...
