Differentially Private Deep Model-Based Reinforcement Learning

Alexandre Rio; Merwan Barlier; Igor Colin; Albert Thomas

Published: 01 Aug 2024, Last Modified: 09 Oct 2024. Venue: EWRL17. License: CC BY 4.0

Keywords: machine learning, reinforcement learning, privacy, differential privacy, deep learning, model-based, offline

TL;DR: We address deep offline reinforcement learning with differential privacy guarantees, using a model-based approach.

Abstract: We address deep offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset. To achieve this, we introduce DP-MORL, a model-based reinforcement learning (MBRL) algorithm with differential privacy guarantees. A private model of the environment is first learned from offline data using DP-FedAvg, a training method for neural networks that provides differential privacy guarantees at the trajectory level. Then, we use model-based policy optimization to derive a policy from the (penalized) private model, without any further interaction with the system or access to the dataset. We empirically show that DP-MORL enables the training of private RL agents from offline data in continuous control tasks, and we outline the price of privacy in this setting.
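
To make the trajectory-level privacy mechanism more concrete, here is a minimal sketch of one aggregation step in the style of DP-FedAvg: each trajectory contributes one model update, every update is clipped to a fixed L2 norm, and Gaussian noise is added before averaging. This is an illustration under simplifying assumptions, not the authors' implementation: it omits trajectory subsampling and privacy accounting, and the names `clip_update`, `private_average`, `clip_norm`, and `noise_multiplier` are hypothetical.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale one per-trajectory update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped per-trajectory updates after adding Gaussian noise.

    Each element of `updates` is assumed to come from ONE trajectory, so
    clipping bounds how much any single trajectory can move the model.
    """
    n = len(updates)
    dim = len(updates[0])
    clipped = [clip_update(u, clip_norm) for u in updates]
    summed = [sum(c[i] for c in clipped) for i in range(dim)]
    # Noise scale is tied to the clipping bound (the sensitivity of the sum).
    sigma = noise_multiplier * clip_norm
    return [(s + rng.gauss(0.0, sigma)) / n for s in summed]


# Hypothetical usage: three trajectories, two model parameters.
rng = random.Random(0)
updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
noisy_avg = private_average(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Clipping at the level of a whole trajectory, rather than a single transition, is what ties the guarantee to individual trajectories; a larger `noise_multiplier` would give stronger privacy at the cost of a noisier model.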

Submission Number: 77
