Learning to Solve Capacitated Arc Routing Problems by Policy Gradient

Han Li, Guiying Li

Published: 2019, Last Modified: 13 Nov 2024, CEC 2019, CC BY-SA 4.0

Abstract: Most heuristic algorithms for NP-hard combinatorial optimization problems require expertise in both the problem domain and heuristic methods. Recent research has begun to apply deep neural networks to learn heuristics for combinatorial optimization problems automatically. These works mainly focus on problems with simple formulations, such as the Travelling Salesman Problem and the Vehicle Routing Problem defined on Euclidean graphs. This paper presents a novel deep reinforcement learning based algorithm for the Capacitated Arc Routing Problem (CARP), which is defined on more complex non-Euclidean information graphs. The proposed approach combines a Graph Convolutional Network with two encoder-decoder models. By regarding the negative objective values of CARP instances as rewards, the proposed method optimizes its parameters with the REINFORCE algorithm. In empirical experiments, the proposed method generates solutions that closely approximate optimal solutions in much less time than heuristic algorithms.
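To make the training signal described in the abstract concrete, the following is a minimal sketch of REINFORCE with the negative objective value used as the reward. It is a toy illustration only and not the authors' model: there is no Graph Convolutional Network and no encoder-decoder, the policy is a plain softmax over three hypothetical candidate solutions, and the names `train_reinforce`, `toy_costs`, the moving-average baseline, and all hyperparameters are assumptions made for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(costs, steps=2000, lr=0.05, seed=0):
    """Toy REINFORCE: a softmax policy picks one candidate solution per step.

    costs[i] is the (hypothetical) objective value of candidate i.
    The reward is the negative cost, as in the abstract.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(costs)   # policy parameters (assumed tabular)
    baseline = 0.0                # moving-average baseline (an assumption)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = rng.choices(range(len(costs)), weights=probs)[0]
        reward = -costs[action]   # negative objective value as reward
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i.
        for i in range(len(logits)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_prob
        # Update the baseline after the gradient step so it never depends
        # on the action it is compared against.
        baseline += 0.1 * (reward - baseline)
    return softmax(logits)


toy_costs = [5.0, 2.0, 9.0]       # hypothetical objective values
final_probs = train_reinforce(toy_costs)
print(final_probs)
```

In a full routing setting the action would be a complete sequence of decisions produced by a neural policy and the cost would be the total route cost, but the update rule keeps the same shape: the gradient of the log-probability scaled by the reward minus a baseline.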