Differentially Private Reinforcement Learning

Pingchuan Ma, Zhiqiang Wang, Le Zhang, Ruming Wang, Xiaoxiang Zou, Tao Yang

Published: 2019, Last Modified: 13 May 2023, Venue: ICICS 2019

Abstract: With remarkable performance and extensive applications, reinforcement learning is becoming one of the most popular learning techniques. Often, the policy $$\pi^*$$ released by a reinforcement learning model may contain sensitive information, and an adversary can infer demographic information by observing the output of the environment. In this paper, we formulate differential privacy in the reinforcement learning context and design mechanisms for the $$\epsilon$$-greedy and Softmax strategies in the K-armed bandit problem that achieve differential privacy guarantees. Our implementation and experiments show that the output policies carry good privacy guarantees at a tolerable utility cost.
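
For readers unfamiliar with how Softmax arm selection relates to differential privacy, the following is a minimal illustrative sketch and not the mechanism from the paper, which the abstract does not specify. It relies on the standard exponential mechanism, which samples an option with probability proportional to exp(budget * utility / (2 * sensitivity)), i.e. a Softmax whose temperature is set by the privacy budget. The function names, the `privacy_budget` and `sensitivity` parameters, and the example reward estimates are all assumptions made for illustration.

```python
import math
import random


def exponential_mechanism_probs(utilities, privacy_budget, sensitivity=1.0):
    """Softmax-style probabilities from the standard exponential mechanism.

    Assumption (not from the paper): `utilities` are per-arm reward estimates
    and `sensitivity` bounds how much any one estimate can change between
    neighboring datasets.
    """
    scale = privacy_budget / (2.0 * sensitivity)
    max_u = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(scale * (u - max_u)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def select_arm(utilities, privacy_budget, sensitivity=1.0, rng=random):
    """Sample one arm index according to the exponential-mechanism probabilities."""
    probs = exponential_mechanism_probs(utilities, privacy_budget, sensitivity)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]


if __name__ == "__main__":
    estimated_rewards = [0.2, 0.5, 0.9]  # hypothetical estimates for K = 3 arms
    print(exponential_mechanism_probs(estimated_rewards, privacy_budget=1.0))
    print(select_arm(estimated_rewards, privacy_budget=1.0))
```

A smaller `privacy_budget` flattens the distribution toward uniform (more privacy, less utility), while a larger one concentrates probability on the best-looking arm, which is one way to picture the privacy and utility trade-off the abstract mentions.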
