Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero

28 Sept 2020 (modified: 22 Oct 2023) | ICLR 2021 Conference Blind Submission | Readers: Everyone
Keywords: Graph neural network, Combinatorial optimization, Reinforcement learning
Abstract: Interest in solving combinatorial optimization problems with machine learning has been growing. Khalil et al. (NeurIPS 2017) proposed an end-to-end reinforcement learning framework that automatically learns graph embeddings to construct solutions to a wide range of problems. However, it sometimes performs poorly on graphs whose characteristics differ from those of the training graphs. To improve its generalization to such graphs, we propose a novel learning strategy based on AlphaGo Zero, a Go engine that achieved superhuman performance without domain knowledge of the game. We redesign AlphaGo Zero for combinatorial optimization problems, taking into account several differences from two-player games. In experiments on five NP-hard problems, including MinimumVertexCover and MaxCut, our method, using only a policy network, generalizes better than the previous method to instances not used for training, including random graphs, synthetic graphs, and real-world graphs. Furthermore, our method is significantly enhanced by a test-time Monte Carlo Tree Search that makes full use of both the policy network and the value network. We also compare recently developed graph neural network (GNN) models and offer insight into which GNN model is a suitable choice for each task.
One-sentence Summary: We train graph representations for combinatorial optimization problems without domain knowledge.
