Scheduling of Robotic Cellular Manufacturing Systems with Timed Petri Nets and Reinforcement Learning

Zhutao Yao, Bo Huang, Jianyong Lv, Xiaoyu Sean Lu, Meiji Cui, Shaohua Yu

Published: 2024, Last Modified: 15 May 2025, IROS 2024, CC BY-SA 4.0

Abstract: This paper proposes a new Petri-net-based Q-learning method for efficiently scheduling robotic cellular manufacturing (RCM) systems. First, we model RCM systems with generalized, place-timed Petri nets (PNs). Then, we design a reinforcement learning method that uses a sparse Q-table to evaluate the state-transition pairs of the net’s reachability graph. The method uses the negative firing time of a transition as the reward for selecting that action and imposes a large penalty on any deadlock it encounters. In addition, it balances state-space exploration against experience exploitation by using a dynamic ϵ-greedy policy, and it updates the state values with an accumulative reward. Three dynamic ϵ-greedy policies are designed for different application scenarios. The proposed method is tested on several benchmark RCM systems and compared with several popular PN-based online dispatching rules, such as FIFO and SRPT. Simulation results demonstrate that our method schedules RCM systems as quickly as the online dispatching rules while outperforming them in terms of schedule makespan. For readers’ reference, our source code and test data are available at https://github.com/PNOptimizer/PNQL.
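
The learning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). Everything in it is an assumption made for illustration: the five-place toy net, the names `TRANSITIONS`, `train`, and `greedy_schedule`, the penalty value, the linearly decaying ϵ (the paper's three dynamic ϵ-greedy policies are not reproduced here), and the standard one-step Q-learning update. For brevity, delays are attached to transitions rather than to places.

```python
import random

# Hypothetical toy net: transition -> (input places, output places, firing time).
# Delays sit on transitions here for brevity; the paper uses place-timed nets.
PLACES = ["p0", "p1", "p2", "p3", "pd"]
TRANSITIONS = {
    "t_fast": ({"p0": 1}, {"p1": 1}, 2),
    "t_slow": ({"p0": 1}, {"p2": 1}, 1),
    "t_trap": ({"p0": 1}, {"pd": 1}, 1),  # leads to a deadlock marking
    "t_a": ({"p1": 1}, {"p3": 1}, 1),
    "t_b": ({"p2": 1}, {"p3": 1}, 5),
}
M0 = (1, 0, 0, 0, 0)        # initial marking
GOAL = (0, 0, 0, 1, 0)      # goal marking
DEADLOCK_PENALTY = -100.0   # assumed value of the "large penalty"


def enabled(marking):
    """Return the transitions enabled at a marking."""
    m = dict(zip(PLACES, marking))
    return [t for t, (pre, _, _) in TRANSITIONS.items()
            if all(m[p] >= w for p, w in pre.items())]


def fire(marking, t):
    """Fire transition t and return the successor marking."""
    m = dict(zip(PLACES, marking))
    pre, post, _ = TRANSITIONS[t]
    for p, w in pre.items():
        m[p] -= w
    for p, w in post.items():
        m[p] += w
    return tuple(m[p] for p in PLACES)


def train(episodes=500, alpha=0.5, gamma=1.0, seed=0):
    """One-step Q-learning over (marking, transition) pairs."""
    rng = random.Random(seed)
    q = {}  # sparse Q-table: only updated pairs are stored
    for ep in range(episodes):
        eps = max(0.05, 1.0 - ep / episodes)  # assumed linear decay
        m = M0
        while m != GOAL:
            acts = enabled(m)
            if not acts:  # deadlock: end the episode
                break
            if rng.random() < eps:
                t = rng.choice(acts)  # explore
            else:
                t = max(acts, key=lambda a: q.get((m, a), 0.0))  # exploit
            m2 = fire(m, t)
            target = -TRANSITIONS[t][2]  # reward = negative firing time
            nxt = enabled(m2)
            if m2 != GOAL:
                if nxt:
                    target += gamma * max(q.get((m2, a), 0.0) for a in nxt)
                else:
                    target += DEADLOCK_PENALTY
            old = q.get((m, t), 0.0)
            q[(m, t)] = old + alpha * (target - old)
            m = m2
    return q


def greedy_schedule(q):
    """Follow the learned Q-values from M0; makespan is None on deadlock."""
    m, seq, makespan = M0, [], 0
    while m != GOAL:
        acts = enabled(m)
        if not acts:
            return seq, None
        t = max(acts, key=lambda a: q.get((m, a), 0.0))
        seq.append(t)
        makespan += TRANSITIONS[t][2]
        m = fire(m, t)
    return seq, makespan


if __name__ == "__main__":
    print(greedy_schedule(train()))
```

In this toy net, a rule that always fires the shortest enabled transition would pick `t_slow` or `t_trap` first, ending in a longer schedule or a deadlock. The learned Q-values instead favor `t_fast` followed by `t_a`, because the negative firing times and the deadlock penalty propagate back to the initial marking.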