A Multi-AGV Routing Planning Method Based on Deep Reinforcement Learning and Recurrent Neural Network

Yishuai Lin, Gang Hu, Liang Wang, Qingshan Li, Jiawei Zhu

Published: 2024, Last Modified: 06 Aug 2024. IEEE CAA J. Autom. Sinica 2024. License: CC BY-SA 4.0

Abstract: Dear Editor, This letter presents a routing planning method for multiple automated guided vehicles (AGVs) based on deep reinforcement learning (DRL) and a recurrent neural network (RNN), specifically proximal policy optimization (PPO) and long short-term memory (LSTM). Compared to traditional AGV path planning methods, such as those based on genetic algorithms or ant colony optimization, our proposed method adapts more readily to temporary changes of tasks or sudden failures of AGVs. Furthermore, our novel routing method, which uses LSTM to take temporal step information into account, performs better in terms of rewards and convergence speed than existing PPO-based routing methods for AGVs.
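For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO-based methods maximize. It is not the letter's implementation: the abstract does not give the network architecture, reward design, or hyperparameters, so the function name `ppo_clip_objective`, the clipping value `clip_eps=0.2`, and the sample inputs are assumptions chosen for illustration only. In an LSTM-based policy, the log-probabilities would presumably come from a recurrent network conditioned on earlier time steps, which is omitted here.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. All names and the default
    clip_eps are illustrative assumptions, not values from the letter.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical batch of two transitions.
    print(ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [1.0, -0.5]))
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage; the clipping only takes effect once the current policy moves away from the one that collected the data.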