2020 (modified: 24 Feb 2022), ICML 2020, Readers: Everyone
Abstract: We propose a reward function estimation framework for inverse reinforcement learning with deep energy-based policies. We name our method PQR, as it sequentially estimates the Policy, the Q-function...