Abstract: Experience replay is a promising approach to improving the learning efficiency of adaptive dynamic programming (ADP). This paper investigates a general model-free ADP approach that uses experience replay to solve optimal control problems in continuous state and action spaces. The critic network and the action network are each modeled as a feedforward neural network with one hidden layer. During learning, a number of recently observed data samples are recorded in a database, and the samples in this database are reused repeatedly to update the weights of both the action network and the critic network. Implementation details of the algorithm are given, and simulation experiments are used to verify the learning efficiency of the proposed approach.
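As a rough illustration of the sample database described in the abstract, the following Python sketch keeps only the most recent observations and hands them back for repeated weight updates. The class name `ReplayBuffer`, its method names, the `(state, action, reward, next_state)` tuple layout, and the uniform random sampling are all assumptions made for this example, not details taken from the paper; the critic and action network updates themselves are omitted.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size store of the most recently observed samples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest sample once full,
        # so the buffer always holds the most recent `capacity` samples.
        self.samples = deque(maxlen=capacity)

    def record(self, state, action, reward, next_state):
        # Assumed sample layout: one observed transition per entry.
        self.samples.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement (an assumption; the abstract
        # does not say how stored samples are selected for each update).
        k = min(batch_size, len(self.samples))
        return rng.sample(list(self.samples), k)

    def __len__(self):
        return len(self.samples)
```

In a training loop, each new observation would be passed to `record`, and each update step would call `sample` one or more times so that stored data are reused rather than discarded after a single update.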