Adversarial Machine Unlearning: A Stackelberg Game Approach

23 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Keywords: machine unlearning, game theory, privacy, membership inference attack, adversarial, stackelberg game
Abstract: This paper addresses machine unlearning, which aims to remove the influence of specific training data from a machine learning model. Unlearning algorithms have traditionally been developed in parallel with membership inference attacks, a privacy threat that determines whether a data instance was used for training. Recognizing this interplay, we propose a game-theoretic framework that integrates the attacks into the design of unlearning algorithms. We model the unlearning problem as a Stackelberg game between two players: a defender striving to unlearn specific training data from a model, and an attacker employing membership inference attacks to detect traces of that data. This adversarial perspective lets new attack advances directly inform the design of unlearning algorithms. Our framework stands out in two ways. First, it enables the exact implementation of advanced membership inference attacks, which provides a way to verify the effectiveness of unlearning. Second, it enables differentiation through the attacks' optimization problems, making the framework readily integrable into end-to-end learning pipelines. We present extensive experimental results to validate the efficacy of the proposed framework.
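The abstract gives no formulas, so the following is only a generic sketch of how a Stackelberg (leader-follower) unlearning game of this kind is commonly written as a bilevel problem. Every symbol here is an assumption introduced for illustration and not the paper's notation: \theta denotes the defender's (unlearned) model parameters, \phi the attacker's membership inference model, f the defender's cost (for example, utility loss on retained data plus a penalty for attack success), and g the attacker's training loss.

```latex
% Hypothetical bilevel form: the defender leads, the attacker best-responds.
\begin{aligned}
\min_{\theta}\quad & F(\theta) := f\bigl(\theta, \phi^{*}(\theta)\bigr) \\
\text{s.t.}\quad   & \phi^{*}(\theta) \in \arg\min_{\phi}\; g(\theta, \phi)
\end{aligned}

% If g is twice differentiable and \nabla^{2}_{\phi\phi} g is invertible at
% \phi^{*}(\theta), differentiating the stationarity condition
% \nabla_{\phi} g(\theta, \phi^{*}(\theta)) = 0 gives the best-response Jacobian
\frac{\mathrm{d}\phi^{*}}{\mathrm{d}\theta}
  = -\bigl[\nabla^{2}_{\phi\phi}\, g\bigr]^{-1} \nabla^{2}_{\phi\theta}\, g ,

% and hence the total gradient available to the defender:
\nabla_{\theta} F(\theta)
  = \partial_{\theta} f
  + \Bigl(\frac{\mathrm{d}\phi^{*}}{\mathrm{d}\theta}\Bigr)^{\!\top} \partial_{\phi} f .
```

Under these assumptions, the second term of the total gradient is what "differentiating through the attack's optimization problem" would refer to: it carries the attacker's reaction to a change in \theta back into the defender's update, which is what allows the attack to sit inside a gradient-based, end-to-end training pipeline.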
Submission Number: 6620