Keywords: machine learning, reinforcement learning, privacy, differential privacy, deep learning, model-based, offline
TL;DR: We address deep offline reinforcement learning with differential privacy guarantees, using a model-based approach.
Abstract: We address private deep offline reinforcement learning (RL), where the goal is to train a policy on standard control tasks that is differentially private (DP) with respect to individual trajectories in the dataset. To achieve this, we introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL first learns an ensemble of trajectory-level DP models of the environment from offline data.
It then optimizes a policy on the penalized private model, without any further interaction with the system or access to the dataset.
In addition to offering strong theoretical guarantees, we empirically demonstrate that PriMORL enables the training of private RL agents on offline continuous control tasks with deep function approximators, whereas current methods are limited to simpler tabular and linear Markov Decision Processes (MDPs). We furthermore outline the trade-offs involved in achieving privacy in this setting.
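To make the two-phase pipeline concrete, below is a minimal, hypothetical sketch, not the paper's implementation: it assumes trajectory-level DP-SGD (per-trajectory gradient clipping plus Gaussian noise) for the model ensemble and a MOPO-style ensemble-disagreement penalty for the pessimistic policy objective; all function names and hyperparameters are illustrative.

```python
# Hypothetical sketch of a PriMORL-style pipeline; names and the exact
# DP/penalty recipes are assumptions, not the authors' code.
import torch
import torch.nn as nn

def make_model(obs_dim: int, act_dim: int) -> nn.Module:
    # Small MLP dynamics model: (state, action) -> predicted next state.
    return nn.Sequential(
        nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
        nn.Linear(64, obs_dim),
    )

def dp_sgd_step(model, trajectories, clip_norm=1.0, noise_mult=1.0, lr=1e-3):
    """One trajectory-level DP-SGD step: each trajectory's gradient is clipped
    to clip_norm, then Gaussian noise calibrated to that norm is added, so the
    update is differentially private with respect to individual trajectories."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for states, actions, next_states in trajectories:
        model.zero_grad()
        pred = model(torch.cat([states, actions], dim=-1))
        loss = ((pred - next_states) ** 2).mean()
        loss.backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)
        for acc, g in zip(summed, grads):
            acc += g * scale  # clipping bounds each trajectory's influence
    with torch.no_grad():
        for p, acc in zip(model.parameters(), summed):
            noise = noise_mult * clip_norm * torch.randn_like(acc)
            p -= lr * (acc + noise) / len(trajectories)

def penalized_reward(models, states, actions, rewards, lam=1.0):
    """MOPO-style pessimism (an assumption here): penalize the reward by the
    ensemble's disagreement, a proxy for model uncertainty, so the policy
    avoids regions where the private models are unreliable."""
    inp = torch.cat([states, actions], dim=-1)
    preds = torch.stack([m(inp) for m in models])  # (ensemble, batch, obs_dim)
    penalty = preds.std(dim=0).norm(dim=-1)        # disagreement per transition
    return rewards - lam * penalty
```

Because the policy is then trained purely on the learned models, the post-processing property of differential privacy lets it inherit the models' guarantee without consuming additional privacy budget.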
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10496