Learning Intrinsic Rewards as a Bi-Level Optimization Problem

Bradly C. Stadie, Lunjun Zhang, Jimmy Ba

2020 (modified: 03 Nov 2022), UAI 2020, Readers: Everyone

Abstract: We reinterpret the problem of finding intrinsic rewards in reinforcement learning (RL) as a bilevel optimization problem. Using this interpretation, we can make use of recent advancements in the hy...
