Reinforcement Learning Controller Design for Discrete-Time-Constrained Nonlinear Systems With Weight Initialization Method

Published: 01 Jan 2024, Last Modified: 29 Sep 2024 · IEEE Trans. Syst. Man Cybern. Syst. 2024 · CC BY-SA 4.0
Abstract: Extensive research has been devoted to reinforcement learning (RL) as a means of learning optimal controllers through interaction with the environment. However, real-world requirements, such as guaranteed safety, pose considerable challenges for existing RL-based optimal controller designs. This article introduces a novel approach to designing RL-based optimal controllers that employs a control barrier function (CBF) together with a nonquadratic loss function on the control signal, with the aim of enabling the agent to learn the optimal controller in a safe and efficient manner. To address the instability of neural network training inherent in traditional RL-based controller design, the nonlinear model predictive control (NMPC) technique is used to initialize the weights of the controller network. A formal proof of the method's optimality is presented. Numerical simulations validate the proposed approach, demonstrating that it effectively learns the optimal controller while respecting the input and state constraints of the system.
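The abstract highlights two ingredients: a nonquadratic input cost that keeps the learned control within its bounds, and supervised initialization of the controller network's weights from NMPC solutions before RL training. The following is a minimal, hypothetical sketch of both ideas, not the authors' implementation; it assumes a PyTorch actor network, a symmetric scalar input bound u_max, a weight r, and externally supplied NMPC state/control pairs.

```python
# Hypothetical sketch, not the paper's code. Assumes a PyTorch actor,
# a scalar control bounded by |u| <= u_max, and NMPC-generated data.
import torch
import torch.nn as nn

u_max = 1.0  # assumed symmetric input constraint
r = 0.5      # assumed control weighting in the nonquadratic cost

def nonquadratic_input_cost(u):
    """Closed form of W(u) = 2*r*∫_0^u u_max*atanh(v/u_max) dv.
    The penalty grows steeply as |u| approaches u_max, so a policy
    minimizing it naturally respects the input bound."""
    z = torch.clamp(u / u_max, -0.999, 0.999)  # numerical safety near the bound
    return 2.0 * r * u_max * u * torch.atanh(z) + r * u_max**2 * torch.log(1.0 - z**2)

# Actor whose output is squashed into the admissible set by tanh.
actor = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1), nn.Tanh())

def pretrain_actor_on_nmpc(states, nmpc_controls, epochs=200):
    """Initialize the actor's weights by imitating NMPC solutions (supervised
    regression), instead of starting the RL phase from random weights."""
    opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
    for _ in range(epochs):
        pred = u_max * actor(states)                 # scale output to [-u_max, u_max]
        loss = nn.functional.mse_loss(pred, nmpc_controls)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()
```

In this sketch, `nonquadratic_input_cost` would be added to the stage cost used by the RL update, and `pretrain_actor_on_nmpc` would be run once on offline NMPC rollouts before RL fine-tuning; both names and the network architecture are illustrative assumptions.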