Abstract: Deep Reinforcement Learning (DRL) combines the perceptual capabilities of deep learning with the decision-making capabilities of Reinforcement Learning (RL) to achieve enhanced decision-making. However, environmental state data contains users' private information, so there is a potential risk of environmental state information being leaked during RL training. Data desensitization and anonymization techniques are currently used to protect data privacy, but they may still leave a risk of privacy disclosure. Meanwhile, policymakers need the environmental state to make decisions, which causes the disclosure of raw environmental data. To address these privacy issues in DRL, we propose a differential privacy-based online DRL algorithm. The algorithm adds Gaussian noise to the gradients of the deep network according to the privacy budget. More importantly, we prove tighter bounds for the privacy budget. Furthermore, we train an autoencoder to protect the raw environmental state data. In this work, we prove the privacy budget formulation for differential privacy-based online deep RL. Experiments show that the proposed algorithm improves privacy protection while still achieving relatively strong decision-making performance.
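To make the gradient-perturbation step concrete, the sketch below shows a DP-SGD-style update in PyTorch: the gradient is clipped to a norm bound and Gaussian noise scaled by a noise multiplier is added before the optimizer step. This is only an illustrative sketch, not the paper's exact algorithm; the function name dp_noisy_step and the parameters clip_norm and noise_multiplier are hypothetical, and a full implementation would clip per-example gradients and track the privacy budget with a privacy accountant.

```python
import torch

def dp_noisy_step(model, loss, optimizer, clip_norm=1.0, noise_multiplier=1.1):
    """One optimizer step with gradient clipping and Gaussian noise.

    Simplified illustration: clips the aggregated gradient instead of
    per-example gradients and omits privacy-budget accounting.
    """
    optimizer.zero_grad()
    loss.backward()
    # Bound the sensitivity of the update by clipping the gradient norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    # Gaussian mechanism: noise scale grows with clip_norm and noise_multiplier.
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.grad.add_(torch.randn_like(p.grad),
                            alpha=noise_multiplier * clip_norm)
    optimizer.step()
```

In this sketch, noise_multiplier plays the role of the sigma that, together with the number of training steps, determines the overall privacy budget.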