An Attack-Defense Game-Based Reinforcement Learning Privacy-Preserving Method Against Inference Attack in Double Auction Market
Abstract: The auction mechanism, as a fair and efficient resource
allocation method, has been widely used in various trading
scenarios, such as advertising, crowdsensing, and spectrum allocation.
However, beyond obtaining higher profits and satisfaction,
privacy concerns have also attracted researchers’ attention. In this
paper, we study the privacy-preserving problem in the double
auction market under indirect inference attacks. Most
existing works apply differential privacy theory to defend
against inference attacks, but two problems remain. First,
the ‘indistinguishability’ guarantee of differential privacy (DP) cannot prevent
the disclosure of continuous valuations in the auction market.
Second, the privacy-utility trade-off (PUT) in DP
deployment has not been resolved. To this end, we propose
an attack-defense game-based reinforcement learning privacy-preserving
method to provide practical privacy protection in
double auctions. First, the auctioneer acts as the defender, adding noise
to the bidders’ valuations, and then acts as the adversary, launching an
inference attack. The auctioneer then uses the attack results
and auction results as a reference to guide the next deployment.
The above process can be modeled as a Markov Decision Process
(MDP): the state is each bidder’s valuation at the
current step, and the action is the noise added to each bidder’s valuation.
The reward combines privacy, utility, and training speed,
with the attack success rate and social welfare serving as
measures of privacy and utility, respectively, and a delay penalty term used to
reduce the training time. Utilizing the deep deterministic policy
gradient (DDPG) algorithm, we establish an actor-critic network
to solve the MDP. Finally, we conduct extensive
evaluations to verify the performance of our proposed method.
The results show that, compared with existing DP-based double auction privacy-preserving mechanisms, our method
achieves better results in both privacy and utility: it reduces
the attack success rate from nearly 100% to less than 20%,
while keeping the utility deviation below 5%.
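As a rough illustration of the reward structure described in the abstract, the following Python sketch combines the three stated components (attack success rate for privacy, social welfare for utility, and a delay penalty on episode length) into a single scalar reward. The linear combination and the weight values `w_p`, `w_u`, `w_d` are illustrative assumptions, not the paper's exact formulation.

```python
def step_reward(attack_success_rate: float,
                social_welfare: float,
                step: int,
                w_p: float = 1.0,
                w_u: float = 1.0,
                w_d: float = 0.01) -> float:
    """Per-step reward for the defender (auctioneer) in the MDP.

    attack_success_rate: fraction of valuations the adversary infers
        correctly (lower means better privacy).
    social_welfare: utility retained by the auction after noise is added
        (higher means better utility).
    step: current step index; later steps incur a delay penalty.
    """
    privacy_term = -w_p * attack_success_rate  # penalize successful inference
    utility_term = w_u * social_welfare        # reward preserved auction utility
    delay_term = -w_d * step                   # discourage slow convergence
    return privacy_term + utility_term + delay_term

# With utility and step held fixed, a lower attack success rate
# (stronger defense) yields a strictly higher reward.
r_weak = step_reward(attack_success_rate=0.95, social_welfare=0.9, step=10)
r_strong = step_reward(attack_success_rate=0.15, social_welfare=0.9, step=10)
```

In a DDPG setup, a reward of this shape would be fed to the critic network, while the actor outputs the continuous noise vector applied to the bidders' valuations.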