Revisiting the Softmax Bellman Operator: New Benefits and New Perspective

Zhao Song, Ronald Parr, Lawrence Carin

ICML 2019 (modified: 11 Nov 2022)

Abstract: The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contr...
