Keywords: Graph neural network, Combinatorial optimization, Reinforcement learning
Abstract: There has been increasing interest in solving combinatorial optimization problems with machine learning.
Khalil et al. (NeurIPS 2017) proposed an end-to-end reinforcement learning framework that automatically learns graph embeddings to construct solutions to a wide range of problems.
However, it sometimes performs poorly on graphs whose characteristics differ from those of the training graphs.
To improve its generalization to diverse graphs, we propose a novel learning strategy based on AlphaGo Zero, a Go engine that achieved superhuman play without domain knowledge of the game.
We redesign AlphaGo Zero for combinatorial optimization problems, taking into account several differences from two-player games.
In experiments on five NP-hard problems, including {\sc MinimumVertexCover} and {\sc MaxCut}, our method, using only a policy network, generalizes better than the previous method to a variety of instances not used for training, including random, synthetic, and real-world graphs.
Furthermore, our method is significantly enhanced by a test-time Monte Carlo Tree Search that makes full use of both the policy network and the value network (a minimal sketch follows the abstract).
We also compare recently developed graph neural network (GNN) models, yielding insight into the suitable choice of GNN model for each task.
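
As a rough illustration of the test-time search mentioned above, here is a minimal, self-contained PUCT-style MCTS that constructs a vertex cover step by step. The toy graph, the uniform `policy_net`, and the rollout-based `value_net` are illustrative stand-ins (in the paper, both would be trained GNNs operating on the input graph); this is a sketch under those assumptions, not the authors' implementation.

```python
import math
import random

# Toy MinimumVertexCover instance; a state is a partial cover (frozenset).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
VERTICES = sorted({v for e in EDGES for v in e})

def is_terminal(cover):
    return all(u in cover or v in cover for u, v in EDGES)

def actions(cover):
    return [v for v in VERTICES if v not in cover]

def policy_net(cover):
    # Stand-in for the trained policy network: uniform priors.
    acts = actions(cover)
    return {a: 1.0 / len(acts) for a in acts}

def value_net(cover):
    # Stand-in for the trained value network: estimate the (negated)
    # final cover size by completing the partial cover at random.
    c = set(cover)
    while not is_terminal(c):
        c.add(random.choice(actions(c)))
    return -len(c)

class Node:
    def __init__(self, cover):
        self.cover = cover
        self.children = {}  # action -> Node
        self.prior = {} if is_terminal(cover) else policy_net(cover)
        self.visits = {a: 0 for a in self.prior}
        self.q = {a: 0.0 for a in self.prior}

    def select(self, c_puct=1.0):
        # PUCT rule: exploit high Q, explore high-prior, low-visit actions.
        total = 1 + sum(self.visits.values())
        return max(self.prior, key=lambda a: self.q[a] +
                   c_puct * self.prior[a] * math.sqrt(total) / (1 + self.visits[a]))

def simulate(node):
    # One descent from this node; returns the leaf value to back up.
    if is_terminal(node.cover):
        return -len(node.cover)
    a = node.select()
    if a not in node.children:
        node.children[a] = Node(node.cover | {a})
        v = value_net(node.children[a].cover)  # evaluate new leaf
    else:
        v = simulate(node.children[a])
    node.visits[a] += 1
    node.q[a] += (v - node.q[a]) / node.visits[a]  # running mean backup
    return v

def mcts_construct(n_sims=50):
    cover = frozenset()
    while not is_terminal(cover):
        root = Node(cover)
        for _ in range(n_sims):
            simulate(root)
        best = max(root.visits, key=root.visits.get)  # most-visited action
        cover = cover | {best}
    return cover

print(sorted(mcts_construct()))  # often the optimal cover, [1, 3]
```

Each move re-runs the search from the current partial solution and commits to the most-visited action, mirroring how AlphaGo Zero selects moves at play time.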
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We learn graph representations for combinatorial optimization problems without domain knowledge.
Supplementary Material: zip
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:1905.11623/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=3KWviO1tQ