Differentially Private Reinforcement Learning

Published: 01 Jan 2019, Last Modified: 13 May 2023, ICICS 2019, Readers: Everyone
Abstract: With remarkable performance and extensive applications, reinforcement learning is becoming one of the most popular learning techniques. Often, the policy $$\pi^*$$ released by a reinforcement learning model may contain sensitive information, and an adversary can infer demographic information by observing the output of the environment. In this paper, we formulate differential privacy in reinforcement learning contexts and design mechanisms for $$\epsilon$$-greedy and Softmax in the K-armed bandit problem that achieve differential privacy guarantees. Our implementation and experiments illustrate that the output policies achieve good privacy guarantees at a tolerable utility cost.
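
The abstract does not spell out the mechanisms, so the following is only a minimal sketch of how differential privacy is commonly attached to these two bandit policies, not the paper's actual construction. It assumes a Laplace-noised value estimate for the greedy step of epsilon-greedy and an exponential-mechanism reading of Softmax. The names `privacy_eps`, `explore_prob`, and `sensitivity` are hypothetical, and the exact privacy guarantee depends on how sensitivity is defined, which the abstract does not specify.

```python
import math
import random


def laplace_noise(scale, rng=random):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_softmax_probs(q_values, privacy_eps, sensitivity=1.0):
    # Exponential-mechanism form of Softmax (an assumption, not the paper's
    # stated design): P(a) is proportional to
    # exp(privacy_eps * Q(a) / (2 * sensitivity)).
    scaled = [privacy_eps * q / (2.0 * sensitivity) for q in q_values]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def private_epsilon_greedy(q_values, explore_prob, privacy_eps,
                           sensitivity=1.0, rng=random):
    # Explore uniformly with probability `explore_prob` (the policy's epsilon,
    # renamed here to avoid clashing with the privacy budget).
    if rng.random() < explore_prob:
        return rng.randrange(len(q_values))
    # Otherwise pick the arm with the largest Laplace-perturbed estimate.
    scale = sensitivity / privacy_eps
    noisy = [q + laplace_noise(scale, rng) for q in q_values]
    return max(range(len(noisy)), key=noisy.__getitem__)


if __name__ == "__main__":
    q = [0.2, 0.5, 0.9]  # hypothetical value estimates for K = 3 arms
    print(private_softmax_probs(q, privacy_eps=1.0))
    print(private_epsilon_greedy(q, explore_prob=0.1, privacy_eps=1.0))
```

In this sketch a smaller `privacy_eps` flattens the Softmax distribution and increases the Laplace noise, which is one way to picture the privacy and utility trade-off the abstract refers to.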