Learning the Value Systems of Societies with Preference-based Multi-objective Reinforcement Learning

Published: 19 Dec 2025, Last Modified: 27 Dec 2025, AAMAS 2026 Full, CC BY 4.0
Keywords: Value Awareness, Value Alignment, Preference-based Reinforcement Learning, Inverse Reinforcement Learning, Multi-objective Reinforcement Learning, COINE
Abstract: Value-aware AI should recognize human values and adapt to the value systems (value-based preferences) of different users. This requires operationalizing values, a step that is prone to misspecification. Because values are social in nature, their representation must hold across multiple users, whereas value systems are diverse yet exhibit patterns among groups. In sequential decision making, prior work has pursued personalization to different goals or values from demonstrations of diverse agents. However, these approaches demand manually designed features or lack value-based interpretability and/or adaptability to diverse user preferences. We propose algorithms for learning models of value alignment and value systems for a society of agents in Markov Decision Processes (MDPs), based on user clustering and preference-based multi-objective reinforcement learning (PbMORL). We jointly learn socially derived value alignment models (groundings) and a set of value systems that concisely represent different groups of users (clusters) in a society. Each cluster consists of a value system representing the value-based preferences of its members and a policy that reflects behaviours aligned with this value system. We evaluate our approach against a state-of-the-art PbMORL algorithm and baselines on two MDPs with human values. Our approach learns a representative value system of the society and induces policies that concisely approximate Pareto-efficient behaviours with respect to a simulated grounding. Our method exhibits advantages over algorithms that require more complex human feedback or manual reward design to achieve value-awareness.
Area: Coordination, Organisations, Institutions, Norms and Ethics (COINE)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 245