A Deep Reinforcement Learning Heuristic for SAT based on Antagonist Graph Neural Networks

Thomas Fournier, Arnaud Lallouet, Télio Cropsal, Gaël Glorian, Alexandre Papadopoulos, Antoine Petitet, Guillaume Perez, Suruthy Sekar, Wijnand Suijlen

Published: 01 Jan 2022, Last Modified: 09 Apr 2024. Venue: ICTAI 2022. Readers: Everyone

Abstract: Heuristics are among the most important tools for guiding search when solving combinatorial problems. They are often designed for a single problem and require both expertise and implementation work. Generic frameworks such as SAT and CSP have developed heuristics that either obey general principles, such as first-fail, or learn and adapt from the exploration of the search tree, such as Dom/wDeg. In SAT, the classic VSIDS heuristic falls into both categories. The question of whether it is possible to learn from solving existing problems has long been addressed by portfolio solvers, in which Machine Learning chooses the best heuristic from hand-crafted features, and more recently by Deep Learning, which embeds this knowledge into a Graph Neural Network (GNN). In this paper, we build upon the latter approach by proposing a new heuristic based on Deep Reinforcement Learning that uses two GNNs with adversarial rewards. We show that our method reduces the number of fails needed to reach the first solution by more than 50% compared to MiniSat. This work shows the advantages of this type of technique for extracting structural and contextual knowledge from past solving experience.
