JaxPlan and GurobiPlan: Optimization Baselines for Replanning in Discrete and Mixed Discrete and Continuous Probabilistic Domains
Keywords: Probabilistic planning, replanning, RDDL, planning-by-backpropagation, mixed discrete-continuous MDP, mixed-integer nonlinear programming, performance evaluation
TL;DR: We extend two existing probabilistic planning tools, planning-by-backpropagation and mixed-integer programming, to mixed discrete-continuous MDPs with nonlinear dynamics, and evaluate their performance against the winners of IPC 2011/14/23.
Abstract: Replanning methods that determinize a stochastic planning problem and replan at each action step have long been known to provide strong baseline (and even competition-winning) solutions to discrete probabilistic planning problems. Recent work has extended replanning methods to mixed discrete and continuous probabilistic domains by leveraging MILP compilations of the RDDL specification language. Other recent advances in probabilistic planning have compiled structured mixed discrete and continuous RDDL domains into a determinized computation graph, which also lends itself to replanning via planning-by-backpropagation methods. However, to date, there has been no comprehensive comparison of these recent optimization-based replanning methodologies with PROST, the state-of-the-art winner of the discrete probabilistic IPC 2011 and 2014 and runner-up in 2018, or with DiSProd, the winner of the mixed discrete-continuous probabilistic IPC 2023. In this paper, we present JaxPlan, which includes several extensive upgrades to both planning by backpropagation and its compact tensorized compilation from RDDL to a Jax computation graph, with discrete relaxations and a sample average approximation. We also provide the first detailed overview of a compilation of the RDDL language specification to Gurobi's Mixed Integer Nonlinear Programming (MINLP) solver, which we term GurobiPlan. We provide a comprehensive comparative analysis of JaxPlan and GurobiPlan against competition-winning planners on 19 domains and a total of 155 instances, assessing their performance across (a) different domains, (b) different instance sizes, and (c) different time budgets. We also release all code needed to reproduce the results, along with the open-source planners described in this work.
Category: Long
Student: No
Supplementary Material: pdf
Submission Number: 326