- Keywords: reinforcement learning, multi-robot/machine, scheduling, planning, scalability, transferability, mean-field inference, graph embedding
- TL;DR: RL can solve (stochastic) multi-robot/scheduling problems scalably and transferably using graph embedding
- Abstract: Can the success of reinforcement learning methods for simple combinatorial optimization problems be extended to multi-robot sequential assignment planning? In addition to the challenge of achieving near-optimal performance in large problems, transferability to an unseen number of robots and tasks is another key challenge for real-world applications. In this paper, we suggest a method that achieves the first success in both challenges for robot/machine scheduling problems. Our method comprises of three components. First, we show any robot scheduling problem can be expressed as a random probabilistic graphical model (PGM). We develop a mean-field inference method for random PGM and use it for Q-function inference. Second, we show that transferability can be achieved by carefully designing two-step sequential encoding of problem state. Third, we resolve the computational scalability issue of fitted Q-iteration by suggesting a heuristic auction-based Q-iteration fitting method enabled by transferability we achieved. We apply our method to discrete-time, discrete space problems (Multi-Robot Reward Collection (MRRC)) and scalably achieve 97% optimality with transferability. This optimality is maintained under stochastic contexts. By extending our method to continuous time, continuous space formulation, we claim to be the first learning-based method with scalable performance in any type of multi-machine scheduling problems; our method scalability achieves comparable performance to popular metaheuristics in Identical parallel machine scheduling (IPMS) problems.
- Code: https://drive.google.com/drive/folders/1vUAR7OmNxCi9MYV3Kwo3RuxHBsemZ-rA?usp=sharing
- Original Pdf: pdf