A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

Chenguang Wang; Yaodong Yang; Congying Han; Tiande Guo; Haifeng Zhang; Jun Wang

29 Sept 2021 (modified: 13 Feb 2023), ICLR 2022 Conference Withdrawn Submission, Readers: Everyone

Keywords: Combinatorial Optimization Problem, Policy Space Response Oracle, Reinforcement Learning

Abstract: In this paper, we shed new light on how to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP). We build a two-player zero-sum game between a trainable solver and a task generator, where the solver aims to solve instances provided by the generator, and the generator aims to generate increasingly difficult instances for the solver. Grounded in the Policy Space Response Oracle (PSRO) framework, our two-player framework yields a behaviourally diverse population of powerful solvers, which we combine through a model mixing method to achieve strong generalization ability on various tasks. Experimentally, we achieve state-of-the-art results on instances from a general TSP instance generation method, on which the performance of other deep learning-based methods degrades substantially. On realistic instances from TSPLib, we attain approximately a 12% improvement over the base model. Furthermore, we empirically illustrate that as the solvers' performance improves, the exploitability of the obtained strategy keeps decreasing, showing gradual convergence to the Nash equilibrium.

One-sentence Summary: Introducing a game-theoretic training framework which aims to improve the generalization ability of deep learning-based TSP solvers.
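The exploitability mentioned in the abstract can be illustrated on a small matrix game. The following is a minimal sketch, not the authors' implementation: it assumes the meta-game between generator and solver populations has been summarized as a payoff matrix, and the function name `exploitability`, the row/column roles, and the example matrix are all hypothetical. For a two-player zero-sum game, exploitability is the total amount both players could gain by switching to a best response, and it is zero exactly at a Nash equilibrium.

```python
def exploitability(payoff, row_mix, col_mix):
    """Exploitability of a mixed-strategy pair in a two-player zero-sum matrix game.

    payoff[i][j] is the payoff to the row player (assumed here to be the
    generator) when row plays i and column (assumed to be the solver) plays j.
    The column player receives the negative of that value.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])

    # Expected payoff of each pure row strategy against the column mixture.
    row_values = [
        sum(payoff[i][j] * col_mix[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Expected row payoff induced by each pure column strategy against the row mixture.
    col_values = [
        sum(row_mix[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]

    # Row's best response maximizes its payoff; column's best response minimizes it.
    # The difference is the sum of both players' best-response gains.
    return max(row_values) - min(col_values)


# Hypothetical 2x2 zero-sum game (matching pennies), used only for illustration.
game = [[1.0, -1.0], [-1.0, 1.0]]
at_equilibrium = exploitability(game, [0.5, 0.5], [0.5, 0.5])
off_equilibrium = exploitability(game, [1.0, 0.0], [0.5, 0.5])
```

In this toy game the uniform mixture is the Nash equilibrium, so its exploitability is zero, while a pure row strategy can be exploited by the column player. In a PSRO-style loop, a quantity of this kind would be tracked across iterations as new best responses are added to each population.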
