High-level Reward Deep Reinforcement Learning Approach for a Novel Physical-Logical Hybrid Factory Line Robot Vehicle Simulation

Ryota Higa, Shinji Nakadai

Published: 2022, Last Modified: 15 May 2024, CASE 2022, CC BY-SA 4.0

Abstract: We propose a novel factory automation method that simultaneously optimizes the logical side of a factory production line, such as inventory and production amount, and the physical path planning of robot vehicles. Traditionally, path planning for robot vehicles and optimization of the overall factory line have been studied independently. However, actual factory production lines require path planning that balances inventory control, production maximization, and coordination with assembly workers. We therefore developed a novel robot vehicle simulation of a physical-logical hybrid factory line and devised a deep reinforcement learning method based on high-level rewards. The method performs sequential path planning while accounting for the balance between inventory and the number of products, as well as the coordination of agents along the production line. As a result, our mobile agent learns to plan the shortest route when the factory production line has no bottleneck in a given episode. Moreover, the agent adjusts its route appropriately when a bottleneck occurs and inventory becomes excessive. This suggests that path planning for robot vehicle agents can be driven by indicators that optimize the entire factory. We expect this to be a novel application of robot vehicle coordination that operates along the entire production line.
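
To make the idea of a high-level reward concrete, the following is a minimal illustrative sketch, not the authors' reward function. It assumes a reward that grows with the number of finished products and shrinks when inventory exceeds a buffer limit, with a small per-step cost that favors shorter routes. The names `LineState` and `high_level_reward`, the fields, and all weights are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LineState:
    """Hypothetical snapshot of the logical production line after one step."""
    products_completed: int   # finished products produced during this step
    inventory: int            # parts currently waiting in the buffer
    inventory_capacity: int   # level above which inventory counts as excessive


def high_level_reward(state: LineState,
                      product_weight: float = 1.0,
                      excess_weight: float = 0.5,
                      step_cost: float = 0.01) -> float:
    """Factory-level reward (illustrative assumption, not the paper's formula).

    Rewards production, penalizes inventory above capacity, and charges a
    small cost per step so that shorter routes are preferred.
    """
    # Only inventory beyond the capacity is penalized.
    excess = max(0, state.inventory - state.inventory_capacity)
    return (product_weight * state.products_completed
            - excess_weight * excess
            - step_cost)


# Usage: one step without excess inventory, one step with excess inventory.
balanced = high_level_reward(LineState(products_completed=2, inventory=3, inventory_capacity=5))
overstocked = high_level_reward(LineState(products_completed=0, inventory=8, inventory_capacity=5))
```

Under these assumed weights, a step that overfills the buffer scores lower than one that completes products without excess inventory, so an agent trained on such a signal has a reason to reroute when a bottleneck builds up rather than always taking the geometrically shortest path.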