Improving Learning to Branch via Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Mixed Integer Programming, Branch and Bound, Strong Branching, Reinforcement Learning, Evolution Strategy, Novelty Search
Abstract: Branch-and-Bound~(B\&B) is a general and widely used algorithmic paradigm for solving Mixed Integer Programming~(MIP). Recently, there has been a surge of interest in designing learning-based branching policies as fast approximations of strong branching, a human-designed heuristic. In this work, we argue that strong branching is not a good expert to imitate, because its decision quality is poor once the side effects of its linear programming solves are turned off. To obtain policies that are more effective and less myopic than a local heuristic, we formulate the branching process in MIP as a reinforcement learning~(RL) problem and design a policy characterization of the B\&B process, which we use to improve our agent via a novelty-search evolution strategy. Across a range of NP-hard problems, our trained RL agent significantly outperforms expert-designed branching rules and state-of-the-art learning-based branching methods in terms of both speed and effectiveness. Our results suggest that, with carefully designed policy networks and learning algorithms, reinforcement learning has the potential to advance algorithms for solving MIPs.
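The abstract names a novelty-search evolution strategy as the training method. As a rough illustration of that family of algorithms (not the paper's actual implementation), below is a minimal NumPy sketch of one NS-ES-style update: parameter perturbations are scored by how novel the behavior they induce is relative to an archive of past behaviors. The names `ns_es_step` and `rollout_fn`, all hyperparameters, and the Euclidean behavior distance are assumptions; in the paper's setting the behavior vector would come from its policy characterization of the B&B search.

```python
import numpy as np

def nearest_neighbor_novelty(behavior, archive, k=10):
    """Novelty score: mean Euclidean distance from `behavior` to its
    k nearest neighbors in the archive of past behavior vectors
    (assumption: standard novelty-search definition)."""
    if len(archive) == 0:
        return 0.0
    dists = np.linalg.norm(np.asarray(archive) - behavior, axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())

def ns_es_step(theta, rollout_fn, archive, sigma=0.02, lr=0.01, pop=32, rng=None):
    """One NS-ES-style update: perturb the policy parameters, score each
    perturbation by the novelty of the behavior it induces, and move
    theta along the ES gradient estimate of novelty.

    `rollout_fn(theta) -> behavior vector` is a hypothetical hook that
    runs the branching policy inside B&B and returns a behavior
    characterization (e.g. statistics of the resulting search tree)."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal((pop, theta.size))
    scores = np.empty(pop)
    for i in range(pop):
        behavior = rollout_fn(theta + sigma * eps[i])
        scores[i] = nearest_neighbor_novelty(behavior, archive)
    # Rank-normalize the scores so the update is invariant to their scale.
    ranks = scores.argsort().argsort().astype(float)
    weights = ranks / (pop - 1) - 0.5
    theta = theta + lr / (pop * sigma) * (weights @ eps)
    archive.append(rollout_fn(theta))  # grow the behavior archive
    return theta
```

Rank normalization is a common ES trick that makes the step size insensitive to the raw magnitude of the novelty scores; a full NS-ES variant would typically also mix novelty with task fitness (e.g. solving time or tree size).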
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Pco3pGL8sn