Abstract: Many real-world coordination tasks---such as environmental monitoring, traffic management, and underwater exploration---are best modelled as multiagent problems with multiple, often conflicting objectives.
Achieving effective coordination in these settings requires addressing two main challenges: 1) balancing multiple objectives and 2) resolving the credit assignment problem to isolate each agent’s contribution from team-level feedback.
Existing multiagent credit assignment methods collapse multi-objective reward vectors into a single scalar, which can obscure the trade-offs between objectives.
In this paper, we introduce the Multi-Objective Difference Evaluation ($D_{MO}$) operator to assign agent-level credit without a priori scalarisation.
$D_{MO}$ measures the change in hypervolume when an agent’s policy is replaced by a counterfactual default, capturing how much that policy contributes to each objective and to the Pareto front.
We embed $D_{MO}$ into the popular NSGA-II algorithm to evolve a population of joint policies with distinct trade-offs.
Empirical results on the Multi-Objective Beach Problem and the Multi-Objective Rover Exploration domain show that our approach matches or surpasses existing baselines, delivering up to a 33\% performance improvement.
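
To make the counterfactual idea behind $D_{MO}$ concrete, the following is a minimal sketch rather than the paper's implementation. It assumes two objectives that are maximised, a reference point at the origin, and a toy team evaluation in which the team reward vector is the sum of per-agent contributions. The names `hypervolume_2d`, `difference_credit`, `evaluate_team`, and the `others` argument (reward vectors of other joint policies sharing the front) are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives maximised)."""
    # Ignore points that do not strictly improve on the reference point.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective downwards, adding new area strips.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def difference_credit(
    joint_policy: List[Point],
    agent: int,
    default: Point,
    evaluate: Callable[[List[Point]], Point],
    others: Sequence[Point] = (),
    ref: Point = (0.0, 0.0),
) -> float:
    """Hypervolume with the agent's policy minus hypervolume with a default policy.

    Assumption: credit is the hypervolume change of the set formed by `others`
    plus this team's reward vector; the paper's exact definition may differ.
    """
    counterfactual = list(joint_policy)
    counterfactual[agent] = default  # swap in the counterfactual default
    actual_hv = hypervolume_2d(list(others) + [evaluate(joint_policy)], ref)
    counterfactual_hv = hypervolume_2d(list(others) + [evaluate(counterfactual)], ref)
    return actual_hv - counterfactual_hv


def evaluate_team(joint_policy: List[Point]) -> Point:
    """Toy stand-in for a rollout: team reward is the sum of agent contributions."""
    return (sum(p[0] for p in joint_policy), sum(p[1] for p in joint_policy))


# Each "policy" here is simply the reward vector that agent contributes.
team = [(2.0, 0.0), (0.0, 3.0), (1.0, 1.0)]
credits = [
    difference_credit(team, i, default=(0.0, 0.0), evaluate=evaluate_team)
    for i in range(len(team))
]
```

In this toy setting each agent receives one scalar credit that reflects its effect on both objectives jointly, without choosing scalarisation weights in advance. Passing a non-empty `others` makes the credit depend on how far the team's point extends a front shared with other joint policies.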