Error Controlled Actor-Critic Method to Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Keywords: reinforcement learning, actor-critic, function approximation, approximation error, KL divergence
Abstract: In reinforcement learning (RL) algorithms that incorporate function approximation, the approximation error of the value function inevitably causes overestimation and has a negative impact on the convergence of the algorithm. To mitigate these negative effects, we propose a new actor-critic algorithm called Error Controlled Actor-Critic, which confines the approximation error in the value function. In this paper, we first present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. We then derive an upper bound on the approximation error of the Q-function approximator and find that the error can be reduced by restricting the KL divergence between every two consecutive policies during policy training. Experiments on a range of continuous control tasks from the OpenAI Gym suite demonstrate that the proposed actor-critic algorithm noticeably reduces the approximation error and significantly outperforms other model-free RL algorithms.
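To make the core idea in the abstract concrete, the sketch below shows one way a KL penalty between consecutive policies can be folded into an actor update in an actor-critic loop. This is a minimal illustrative sketch in PyTorch, not the authors' released implementation; the `Actor`, `Critic`, `update_actor`, `kl_coef`, and `n_steps` names are assumptions introduced here, and the penalty coefficient stands in for whatever constraint mechanism the paper actually uses.

```python
# Minimal sketch (assumed, not the paper's code): an actor update that maximizes
# the critic's Q estimate while penalizing the KL divergence between the policy
# before the update round and the policy being optimized, so the policy shifts
# slowly and the Q-function's targets stay close to what it was trained on.
import copy
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Diagonal-Gaussian policy over continuous actions (illustrative)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        return torch.distributions.Normal(self.body(obs), self.log_std.exp())

class Critic(nn.Module):
    """Q(s, a) approximator (illustrative)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def update_actor(actor, critic, obs, optimizer, kl_coef=1.0, n_steps=10):
    """Run several gradient steps that maximize Q while keeping the updated
    policy close (in KL) to the policy from before this update round."""
    old_actor = copy.deepcopy(actor)          # snapshot of the previous policy
    for p in old_actor.parameters():
        p.requires_grad_(False)

    for _ in range(n_steps):
        dist = actor.dist(obs)
        action = dist.rsample()               # reparameterized action sample
        q_value = critic(obs, action).mean()

        # KL between the previous policy and the current one, averaged over states.
        kl = torch.distributions.kl_divergence(
            old_actor.dist(obs), dist).sum(-1).mean()

        loss = -q_value + kl_coef * kl        # KL term confines the policy shift
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item(), kl.item()

if __name__ == "__main__":
    obs_dim, act_dim = 8, 2
    actor, critic = Actor(obs_dim, act_dim), Critic(obs_dim, act_dim)
    opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
    batch = torch.randn(32, obs_dim)          # placeholder observations
    print(update_actor(actor, critic, batch, opt))
```

In this sketch the trade-off between improving the Q value and limiting the policy change is controlled by a fixed `kl_coef`; a hard KL constraint or an adaptive coefficient would serve the same purpose of bounding the divergence between consecutive policies.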
One-sentence Summary: We propose a new actor-critic algorithm called Error Controlled Actor-Critic, which confines the approximation error in the value function.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=iYAUv54ekm