Using Reinforcement Learning for Per-Instance Algorithm Configuration on the TSP

Published: 01 Jan 2023, Last Modified: 30 Sept 2024SSCI 2023EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Automated Algorithm Configuration (AAC) usually takes a global perspective: it identifies a parameter configuration for an (optimization) algorithm that maximizes a performance metric over a set of instances. However, the optimal choice of parameters strongly depends on the instance at hand and should thus be calculated on a per-instance basis. We explore the potential of Per-Instance Algorithm Configuration (PIAC) by using Reinforcement Learning (RL). To this end, we propose a novel PIAC approach that is based on deep neural networks. We apply it to predict configurations for the Lin-Kernighan heuristic (LKH) for the Traveling Salesperson Problem (TSP) individually for every single instance. To train our PIAC approach, we create a large set of 100 000 TSP instances with 2 000 nodes each - currently the largest benchmark set to the best of our knowledge. We compare our approach to the state-of-the-art AAC method Sequential Model-based Algorithm Configuration (SMAC). The results show that our PIAC approach outperforms this baseline on both the newly created instance set and established instance sets.
Loading