Multi-objective Reinforcement Learning in Factored MDPs with Graph Neural Networks

Published: 01 Jan 2023 · Last Modified: 10 Oct 2024 · AAMAS 2023 · CC BY-SA 4.0
Abstract: Many potential applications of reinforcement learning involve complex, structured environments. Some of these problems can be analyzed as factored MDPs, where the dynamics decompose into locally independent state transitions and the reward is written as a sum of local rewards. In some scenarios, however, these rewards represent conflicting objectives, so the problem is better interpreted as a multi-objective one, with a weight associated with each reward. To deal with such multi-objective factored MDPs, we propose a method that combines graph neural networks, to process structured representations, with vector-valued Q-learning. We show that our approach empirically outperforms methods that learn directly from the scalarized reward, and demonstrate its ability to generalize to different weights and numbers of entities.
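The core idea of vector-valued Q-learning with linear scalarization can be illustrated in a toy setting. The sketch below is an assumption-laden simplification: it uses a tabular Q-function in place of the paper's graph neural network, and a small hand-specified MDP; the function name and interface are hypothetical, not from the paper.

```python
import numpy as np

# Hedged sketch of vector-valued Q-learning with linear scalarization.
# The paper uses a GNN over a factored state; here a plain table stands
# in for the function approximator, purely for illustration.
def vector_q_learning(n_states, n_actions, n_objectives, transitions,
                      rewards, w, episodes=500, steps=50, alpha=0.1,
                      gamma=0.9, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Q holds one value per objective: shape (state, action, objective)
    Q = np.zeros((n_states, n_actions, n_objectives))
    for _ in range(episodes):
        s = 0
        for _ in range(steps):
            # Epsilon-greedy over the scalarized value w . Q(s, a)
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s] @ w))
            s2 = transitions[s][a]
            r = rewards[s][a]  # reward is a vector, one entry per objective
            # Bootstrap from the action that is greedy under the current weights
            a2 = int(np.argmax(Q[s2] @ w))
            # Vector TD update: all objective components updated jointly
            Q[s, a] += alpha * (r + gamma * Q[s2, a2] - Q[s, a])
            s = s2
    return Q
```

Because the learned Q-values remain vectors rather than a single scalarized value, they retain per-objective information; this is what makes it possible, in principle, to re-evaluate the greedy policy under different weight vectors, which relates to the generalization across weights mentioned in the abstract.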