Combining MORL with Restraining Bolts to Learn Normative Behaviour

Published: 15 Jun 2025, Last Modified: 07 Aug 2025AIA 2025EveryoneRevisionsBibTeXCC BY 4.0
Keywords: restraining bolts, normative reasoning, multi-objective reinforcement learning, LTLf
Abstract: Normative Restraining Bolts (NRBs) adapt the restraining bolt technique (originally developed for safe reinforcement learning) to ensure compliance with social, legal, and ethical norms. While effective, NRBs rely on trial-and-error weight tuning, which hinders their ability to enforce hierarchical norms; moreover, norm updates require retraining. In this paper, we reformulate learning with NRBs as a multi-objective reinforcement learning (MORL) problem, where each norm is treated as a distinct objective. This enables the introduction of Ordered Normative Restraining Bolts (ONRBs), which support algorithmic weight selection, prioritized norms, norm updates, and provide formal guarantees on minimizing norm violations. Case studies show that ONRBs offer a robust and principled foundation for RL-agents to comply with a wide range of norms while achieving their goals.
Paper Type: Previously Published Paper
Venue For Previously Published Paper: IJCAI 2025
Submission Number: 15
Loading