RP-DQN: An Application of Q-Learning to Vehicle Routing Problems

Ahmad Bdeir, Simon Boeder, Tim Dernedde, Kirill Tkachuk, Jonas K. Falkner, Lars Schmidt-Thieme

Published: 2021, Last Modified: 10 May 2023, KI 2021

Abstract: In this paper, we present a new approach to complex routing problems, with an improved state representation that makes better use of the model complexity than previous methods. We enable this by training from temporal differences, specifically with Q-Learning. We show that our approach achieves state-of-the-art performance on the Capacitated Vehicle Routing Problem (CVRP) among autoregressive policies that construct solutions by sequentially inserting nodes. Additionally, we are the first to tackle the Multiple Depot Vehicle Routing Problem (MDVRP) with Reinforcement Learning (RL), and we demonstrate that this problem type benefits greatly from our approach compared with other Machine Learning (ML) methods.
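
The abstract names temporal-difference training with Q-Learning but does not state an update rule. As a rough illustration only, the sketch below shows the textbook one-step Q-learning target applied to a routing-style decision. It is not the paper's model: the function names, the feasibility mask, the use of negative travel distance as reward, and the tabular-style update are all assumptions made for this example.

```python
def td_target(reward, next_q_values, feasible_mask, gamma=1.0, done=False):
    """One-step Q-learning target: r + gamma * max over feasible next actions.

    reward        -- assumed here to be the negative distance of the last move
    next_q_values -- Q estimates for each candidate node in the next state
    feasible_mask -- True where a node may still be visited (hypothetical mask,
                     e.g. unvisited and within remaining vehicle capacity)
    """
    if done:
        return reward
    feasible = [q for q, ok in zip(next_q_values, feasible_mask) if ok]
    if not feasible:
        # No feasible successor: treat the state as terminal.
        return reward
    return reward + gamma * max(feasible)


def q_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction lr toward the TD target."""
    return q_value + lr * (target - q_value)


# Example: the last move cost 2.0 distance units; node 1 is infeasible.
target = td_target(-2.0, [-5.0, -3.0, -4.0], [True, False, True])
new_q = q_update(-5.0, target, lr=0.1)
```

In a deep Q-learning setting, the second function would typically be replaced by a gradient step on the squared difference between the network's estimate and the target; the masking of infeasible nodes in the maximum stays the same.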
