A Learning-based Capacitated Arc Routing Problem Solver Comparable to Metaheuristics While with Far Less Runtimes

Runze Guo; Feng Xue; Nicu Sebe; Anlong Ming

12 May 2024 (modified: 06 Nov 2024), Submitted to NeurIPS 2024, CC BY-NC-SA 4.0

Keywords: Capacitated Arc Routing, Metaheuristics, Reinforcement learning

TL;DR: We introduce a learning-based CARP solver that significantly narrows the gap with advanced metaheuristics while requiring far less runtime.

Abstract: Recently, neural networks (NNs) have made great strides in combinatorial optimization problems (COPs). However, they face challenges in solving the capacitated arc routing problem (CARP), which is to find the minimum-cost tour that covers all required edges on a graph while satisfying capacity constraints. In practice, NN-based approaches tend to lag behind advanced metaheuristics due to complexities caused by the non-Euclidean graph, traversal direction, and capacity constraints. In this paper, we introduce an NN-based solver tailored to these complexities, which significantly narrows the gap with advanced metaheuristics while requiring far less runtime. First, we propose the direction-aware attention model (DaAM) to incorporate directionality into the embedding process, facilitating more effective one-stage decision-making. Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy for subsequent reinforcement fine-tuning. This scheme proves particularly valuable for solving CARP, which has a higher complexity than node routing problems (NRPs). Finally, a path optimization method is introduced to adjust the depot return positions within the path generated by DaAM. Experiments show that DaAM surpasses heuristics and, for the first time, achieves decision quality comparable to state-of-the-art metaheuristics while maintaining superior efficiency, even on large-scale CARP instances. The code and datasets are provided in the Appendix.
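
To make the idea of adjusting depot return positions concrete, the sketch below shows one plausible way to do it. The abstract does not specify the authors' procedure, so this uses a classic split-style dynamic program as a stand-in. It assumes the service order and traversal direction of the required edges are already fixed, and that a shortest-path distance matrix is available. The function name `split_depot_returns`, the tuple layout of `arcs`, and the `dist` matrix are all hypothetical names chosen for illustration.

```python
from math import inf

def split_depot_returns(arcs, dist, capacity, depot=0):
    """Choose where a fixed sequence of directed arcs returns to the depot.

    arcs: list of (tail, head, service_cost, demand), already ordered and
          oriented (assumed to come from some upstream solver).
    dist: dist[a][b] is the shortest-path cost from node a to node b.
    Returns (total_cost, routes), where routes is a list of arc sublists.
    """
    n = len(arcs)
    best = [0.0] + [inf] * n   # best[j]: cheapest cost to serve the first j arcs
    prev = [0] * (n + 1)       # prev[j]: index where the last route starts

    for i in range(n):
        if best[i] == inf:
            continue
        load = 0
        cost = 0.0
        # Try to serve arcs i..j in a single route starting from the depot.
        for j in range(i, n):
            tail, head, service_cost, demand = arcs[j]
            load += demand
            if load > capacity:
                break
            if j == i:
                cost = dist[depot][tail] + service_cost
            else:
                cost += dist[arcs[j - 1][1]][tail] + service_cost
            total = best[i] + cost + dist[head][depot]
            if total < best[j + 1]:
                best[j + 1] = total
                prev[j + 1] = i

    if best[n] == inf:
        raise ValueError("some arc demand exceeds the vehicle capacity")

    # Walk the predecessor pointers backwards to recover the routes.
    routes = []
    j = n
    while j > 0:
        i = prev[j]
        routes.append(arcs[i:j])
        j = i
    routes.reverse()
    return best[n], routes
```

For example, on a three-node line graph with `dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]` and `arcs = [(1, 2, 1, 1), (2, 1, 1, 1)]`, a capacity of 2 keeps both arcs in one route, while a capacity of 1 forces a depot return between them.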

Primary Area: Reinforcement learning

Submission Number: 4943
