Keywords: Mixed-integer linear programming, Branch and bound, Reinforcement learning, Markov decision process
TL;DR: We broaden the scope for reinforcement learning in mixed-integer linear programming by modeling variable selection in Branch and Bound as a vanilla Markov decision process.
Abstract: Mixed-Integer Linear Programming (MILP) is a powerful framework used to address a wide range of NP-hard combinatorial optimization problems, often solved by Branch and Bound (B&B). A key factor influencing the performance of B&B solvers is the variable selection heuristic that governs branching decisions. Recent contributions have sought to adapt reinforcement learning (RL) algorithms to the B&B setting to learn optimal branching policies, through Markov Decision Process (MDP)-inspired formulations and ad hoc convergence theorems and algorithms. In this work, we introduce B&B MDPs, a principled vanilla MDP formulation for variable selection in B&B, which allows a broad range of RL algorithms to be leveraged for learning optimal B&B heuristics. Computational experiments validate our model empirically, as our branching agent outperforms prior state-of-the-art RL agents on four standard MILP benchmarks.
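To make the formulation concrete, here is a minimal, self-contained sketch of how variable selection in B&B can be cast as an episodic MDP, on a toy 0/1 knapsack solved by LP-relaxation-based branch and bound. This is not the paper's B&B MDP or its implementation: all identifiers (`BranchingMDP`, `lp_relax`), the depth-first node selection, and the reward of -1 per branching decision are illustrative assumptions.

```python
import random

def lp_relax(values, weights, capacity, fixed):
    """Greedy LP relaxation of a 0/1 knapsack with some variables fixed.
    Returns (upper bound, fractional solution); bound is -inf if infeasible."""
    n = len(values)
    sol = [0.0] * n
    cap, total = capacity, 0.0
    for j, v in fixed.items():
        sol[j] = float(v)
        if v == 1:
            cap -= weights[j]
            total += values[j]
    if cap < 0:
        return float("-inf"), sol  # fixings exceed capacity
    order = sorted((j for j in range(n) if j not in fixed),
                   key=lambda j: values[j] / weights[j], reverse=True)
    for j in order:
        take = min(1.0, cap / weights[j])
        sol[j] = take
        total += take * values[j]
        cap -= take * weights[j]
        if cap <= 1e-9:
            break
    return total, sol

class BranchingMDP:
    """Episode = solving one instance to optimality.
    State  = LP solution at the current open node plus its fractional variables.
    Action = index of the fractional variable to branch on.
    Reward = -1 per branching decision, so return = -(number of decisions)."""

    def __init__(self, values, weights, capacity):
        self.v, self.w, self.c = values, weights, capacity

    def reset(self):
        self.incumbent = float("-inf")
        self.open_nodes = [{}]          # each node = dict of fixed variables
        return self._advance()

    def _advance(self):
        # Process open nodes until one needs a branching decision (or none left).
        while self.open_nodes:
            fixed = self.open_nodes.pop()          # depth-first node selection
            bound, sol = lp_relax(self.v, self.w, self.c, fixed)
            if bound <= self.incumbent:
                continue                           # pruned by bound / infeasible
            frac = [j for j, x in enumerate(sol) if 1e-9 < x < 1 - 1e-9]
            if not frac:
                self.incumbent = bound             # integral LP: new incumbent
                continue
            self.node = fixed
            return {"lp_solution": sol, "action_set": frac}
        return None                                # tree exhausted: episode over

    def step(self, j):
        # Branch on variable j: create the two children x_j = 0 and x_j = 1.
        for val in (0, 1):
            child = dict(self.node)
            child[j] = val
            self.open_nodes.append(child)
        state = self._advance()
        return state, -1.0, state is None

# A uniformly random branching policy, standing in for a trained RL agent:
env = BranchingMDP(values=[6, 5, 4, 3], weights=[4, 3, 3, 2], capacity=6)
state, ret = env.reset(), 0.0
while state is not None:
    state, reward, done = env.step(random.choice(state["action_set"]))
    ret += reward
print("optimal value:", env.incumbent, "episodic return:", ret)
```

The -1-per-decision reward is one simple proxy under which maximizing the episodic return directly minimizes the number of branching decisions, i.e., the size of the B&B tree; the reward actually used in the paper is not specified in the abstract.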
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Paul_STRANG2
Track: Regular Track: unpublished work
Submission Number: 32