Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning

21 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Multi-objective, Reinforcement learning, Pareto front
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a new algorithm that provably learns a Pareto front using a divide and conquer approach.
Abstract: We introduce a novel algorithm for learning the Pareto front in multi-objective Markov decision processes. Our algorithm decomposes learning the Pareto front into a sequence of single-objective problems, each of which is solved by an oracle and leads to a non-dominated solution. We propose a procedure to select the single-objective problems such that each iteration monotonically decreases the objective space that possibly still contains Pareto optimal solutions. The final algorithm is proven to converge to the Pareto front and provides an upper bound on the distance to undiscovered non-dominated policies in each iteration. We introduce several practical designs of the required oracle by extending single-objective reinforcement learning algorithms. When evaluating our algorithm with these oracles on benchmark environments, we find that it leads to a close approximation of the true Pareto front. By leveraging problem-specific single-objective solvers, our approach holds promise for applications beyond multi-objective reinforcement learning, such as in pathfinding and optimisation.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3694
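
Illustrative note on the abstract's terminology: the sketch below shows what "non-dominated" means for a finite set of return vectors, and how one single-objective problem can yield a non-dominated solution. It is a minimal sketch under stated assumptions, not the submission's algorithm. The names `dominates`, `pareto_front`, `weighted_sum_oracle` and the `candidates` data are hypothetical, and linear scalarisation with strictly positive weights is used only as a simple stand-in for the oracle the abstract mentions.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """True if `a` is at least as good as `b` in every objective and
    strictly better in at least one (all objectives are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Vector]) -> List[Vector]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum_oracle(points: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Hypothetical stand-in oracle: solve one single-objective problem by
    maximising a weighted sum of the objectives over a finite candidate set.
    With strictly positive weights the maximiser is always non-dominated."""
    return max(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))


# Made-up two-objective return vectors, for illustration only.
candidates: List[Vector] = [
    (1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (5.0, 1.0), (1.0, 1.0),
]

front = pareto_front(candidates)
solution = weighted_sum_oracle(candidates, weights=(0.8, 0.2))
print(front)
print(solution, solution in front)
```

A weighted sum can only reach points on the convex hull of the front, which is one reason the choice of single-objective problems matters; the procedure the submission proposes for selecting them is not reproduced here.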