PID Accelerated Value Iteration Algorithm

Amir Massoud Farahmand, Mohammad Ghavamzadeh

2021 (modified: 24 Sept 2022) | ICML 2021 | Readers: Everyone

Abstract: The convergence rate of Value Iteration (VI), a fundamental procedure in dynamic programming and reinforcement learning, for solving MDPs can be slow when the discount factor is close to one. We pr...
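The slowdown described in the abstract's first sentence can be seen on a toy problem. The sketch below runs standard Value Iteration (not the accelerated variant this paper proposes) on a hypothetical 2-state, 2-action MDP. The transition table `P`, the reward table `R`, the tolerance, and the function name `value_iteration` are all assumptions made for illustration and do not come from the paper.

```python
def value_iteration(P, R, gamma, tol=1e-6, max_iters=1_000_000):
    """Standard VI. Returns (V, number of sweeps until the max-norm change < tol)."""
    n_states = len(P)
    V = [0.0] * n_states
    for k in range(1, max_iters + 1):
        # Bellman optimality backup for every state.
        V_new = [
            max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            return V, k
    return V, max_iters


# Hypothetical MDP: P[s][a] is a distribution over next states, R[s][a] a reward.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]

if __name__ == "__main__":
    for gamma in (0.5, 0.9, 0.99):
        _, sweeps = value_iteration(P, R, gamma)
        print(f"gamma={gamma}: {sweeps} sweeps")
```

Because the Bellman backup is a contraction with factor gamma, the change between successive iterates shrinks roughly like gamma^k, so the number of sweeps needed for a fixed tolerance grows as gamma approaches one.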
