Keywords: Policy Space Response Oracle, Combinatorial Optimization Problem, Generalization Ability
TL;DR: Introducing a game-theoretic training framework that aims to improve the generalization ability of deep learning-based TSP solvers
Abstract: In this paper, we introduce a two-player zero-sum framework between a trainable \emph{Solver} and a \emph{Data Generator} to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP).
Grounded in \textsl{Policy Space Response Oracles} (PSRO), our two-player framework outputs a population of best-responding Solvers, over which we mix to obtain a combined model that attains the lowest exploitability against the Generator, and thereby the most generalizable performance across different TSP tasks.
We conduct experiments on a variety of TSP instances of different types and sizes. The results suggest that our Solvers achieve state-of-the-art performance even on tasks they have never encountered, whereas the performance of other deep learning-based Solvers drops sharply due to overfitting.
To illustrate the principle underlying our framework, we study the learning outcome of the proposed two-player game and show that the exploitability of the Solver population decreases during training, so that the population, together with the Generator, eventually approximates a Nash equilibrium.
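To make the training loop concrete, below is a minimal, self-contained PSRO sketch on a toy zero-sum matrix game. This is an illustrative assumption, not the paper's implementation: the payoff matrix `G`, the `solve_zero_sum` helper, and the argmax best-response oracles stand in for the trained Solver and Generator networks and their learned best responses.

```python
# Minimal PSRO sketch on a toy zero-sum matrix game, illustrating the
# Solver-vs-Generator loop: grow populations by best responses, mix them
# via a Nash solve, and watch exploitability shrink. All names here are
# hypothetical stand-ins for the paper's actual components.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
# Underlying zero-sum game: entry [i, j] is the Solver's payoff (e.g. the
# negative optimality gap) when Solver strategy i meets Generator task j.
G = rng.standard_normal((8, 8))

def solve_zero_sum(M):
    """Nash mixture and value for the row player of zero-sum matrix M (LP)."""
    m, n = M.shape
    # Variables: row mixture x (m entries) and game value v.
    # Maximize v subject to (M^T x)_j >= v, sum(x) = 1, x >= 0.
    c = np.zeros(m + 1); c[-1] = -1.0            # linprog minimizes, so use -v
    A_ub = np.hstack([-M.T, np.ones((n, 1))])    # v - (M^T x)_j <= 0
    A_eq = np.hstack([np.ones((1, m)), [[0.0]]])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]

# PSRO loop: start with one strategy each and repeatedly add best responses.
solver_pop, gen_pop = [0], [0]
for _ in range(6):
    M = G[np.ix_(solver_pop, gen_pop)]
    sol_mix, value = solve_zero_sum(M)
    gen_mix, _ = solve_zero_sum(-M.T)            # Generator plays the dual game
    # Best-response oracles: here an argmax over all strategies; in the paper
    # this is where a new Solver (or Generator) would be trained.
    br_solver = int(np.argmax(G[:, gen_pop] @ gen_mix))
    br_gen = int(np.argmin(sol_mix @ G[solver_pop, :]))
    if br_solver not in solver_pop: solver_pop.append(br_solver)
    if br_gen not in gen_pop: gen_pop.append(br_gen)
    # Exploitability of the current mixtures: decreases toward 0 at Nash.
    exploit = (G[:, gen_pop] @ gen_mix).max() - (sol_mix @ G[solver_pop, :]).min()
    print(f"value={value:+.3f}  exploitability={exploit:.3f}")
```

In the paper's setting, the argmax oracles would be replaced by gradient-based training of a new Solver against the Generator's current task mixture (and vice versa), but the population-growth and Nash-mixing structure is the same.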