Abstract: The Kalman filter is an efficient way to estimate the parameters of the value function in reinforcement learning. To solve Markov decision process (MDP) problems with continuous state and action spaces, this paper proposes a new online reinforcement learning algorithm based on the Kalman filter, called Kalman filter-based actor-critic (KAC) learning. To implement the KAC algorithm, Cerebellar Model Articulation Controller (CMAC) neural networks approximate the value function and the policy function, respectively, and the Kalman filter estimates the weights of the critic network. Two benchmark problems, cart-pole balancing and acrobot swing-up, are used to verify the effectiveness of the KAC approach. Experimental results demonstrate that the proposed KAC algorithm is more efficient than other similar algorithms.
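To illustrate how a Kalman filter can estimate critic weights, the sketch below applies one common formulation to a linear critic. It is not taken from the paper: the observation model (reward = (phi(s) - gamma * phi(s')) . w + noise), the random-walk model for the weights, the noise values, the one-hot features standing in for CMAC tiles, and the name kalman_td_update are all assumptions made for illustration.

```python
import numpy as np


def kalman_td_update(w, P, phi, phi_next, reward, gamma=0.9,
                     process_noise=1e-4, obs_noise=0.1):
    """One Kalman-filter step for the weights of a linear critic V(s) = phi(s) @ w.

    Assumed observation model: reward = (phi - gamma * phi_next) @ w + noise.
    """
    h = phi - gamma * phi_next                    # observation vector
    P_pred = P + process_noise * np.eye(len(w))   # random-walk model for the weights
    innovation = reward - h @ w                   # temporal-difference style residual
    S = h @ P_pred @ h + obs_noise                # innovation variance (scalar)
    K = P_pred @ h / S                            # Kalman gain
    w_new = w + K * innovation                    # corrected weight estimate
    P_new = P_pred - np.outer(K, h) @ P_pred      # corrected weight covariance
    return w_new, P_new


# Hypothetical toy chain: s0 -> s1 -> terminal, reward 1 on every transition.
# One-hot features stand in for CMAC tiles; the terminal state has zero features.
features = {0: np.array([1.0, 0.0]),
            1: np.array([0.0, 1.0]),
            "terminal": np.zeros(2)}
transitions = [(0, 1, 1.0), (1, "terminal", 1.0)]

w = np.zeros(2)          # initial weight estimate
P = 10.0 * np.eye(2)     # large initial uncertainty

for _ in range(200):
    for s, s_next, r in transitions:
        w, P = kalman_td_update(w, P, features[s], features[s_next], r)
```

With gamma = 0.9 the true values of this chain are V(s1) = 1 and V(s0) = 1 + 0.9 * 1 = 1.9, so the estimated weights should approach those numbers. A full actor-critic method would also update a policy (actor) network, which this sketch omits.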