Belief Propagation of Pareto Front in Multi-Objective MDP Graphs

Published: 01 Jan 2023, Last Modified: 19 Jul 2024, Venue: MLSP 2023, License: CC BY-SA 4.0
Abstract: In Markov Decision Processes (MDPs), forward-backward probability propagation on factor graphs has proven useful for finding optimal policies. With vector rewards, however, the trade-off among the constituent objectives must be evaluated. In this work, assuming multiple rewards, we show how to use belief propagation to generate the Pareto front dynamically and propagate it as a forward flow distribution. The idea is applied to path planning on discrete 1D and 2D grids in which different sets of states carry vector rewards in the form of priors.
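To make the notion of a Pareto front over vector rewards concrete, here is a minimal sketch in Python. It is not the paper's belief-propagation formulation: it assumes deterministic, additive two-objective rewards that are maximized, and it simply prunes dominated cumulative rewards while moving forward along a path. The names `dominates`, `pareto_front`, and `extend_front`, and the toy reward values, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Reward = Tuple[int, ...]  # one entry per objective (assumption: larger is better)


def dominates(a: Reward, b: Reward) -> bool:
    """True if a is at least as good as b in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Reward]) -> List[Reward]:
    """Keep only the points that no other point dominates, in sorted order."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))


def extend_front(front: List[Reward], step_rewards: List[Reward]) -> List[Reward]:
    """Add every candidate step reward to every point on the front, then prune."""
    candidates = [
        tuple(c + r for c, r in zip(cumulative, step))
        for cumulative in front
        for step in step_rewards
    ]
    return pareto_front(candidates)


# Toy forward pass: two steps, each offering a choice of vector rewards.
front = [(0, 0)]
for step_rewards in [[(1, 0), (0, 1)], [(2, 0), (0, 1), (0, 0)]]:
    front = extend_front(front, step_rewards)

print(front)  # [(0, 2), (2, 1), (3, 0)]
```

In this toy run, the cumulative reward `(1, 1)` is discarded because `(2, 1)` is at least as good in both objectives and strictly better in the first. The three surviving points are mutually non-dominated trade-offs.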