Designing ethical environments using multi-agent reinforcement learning

Published: 01 Apr 2025, Last Modified: 02 May 2025, ALA, CC BY 4.0
Keywords: Ethical values, learning for value alignment, multi-agent reinforcement learning
TL;DR: We present an algorithm for creating ethical environments in which agents learn to behave ethically
Abstract: This paper introduces the Approximate Ethical Embedding Process, an algorithm for automating the design of ethical environments for learning agents. The algorithm helps build environments in which multiple agents learn policies that align with an ethical (moral) value while simultaneously pursuing their individual objectives. We thereby provide environment designers with algorithmic tools for building ethical environments. We demonstrate the ethical design process in two different settings of a gathering environment, where agents must adhere to the value of beneficence to promote the collective survival of the population. Our experiments show that the approximate embedding process successfully generates environments that incentivise the learning of value-aligned policies.
Supplementary Material: zip
Type Of Paper: Full paper (max 8 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 1