A Comprehensive Study of Privacy Risks in Curriculum Learning

15 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: pdf
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: curriculum learning, membership inference attack, attribute inference attack
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We study the privacy risks introduced by curriculum learning through the lens of membership inference attack (MIA) and attribute inference attack (AIA)
Abstract: Curriculum learning (CL) is a machine learning technique that progressively trains a model on data of increasing difficulty or complexity. This way, the model can learn more efficiently and achieve better performance than with random or uniform sampling of the data. However, most existing work focuses on improving the performance of CL, while its privacy risks have never been studied. In this work, we take the first step toward investigating the privacy leakage of CL through the lens of membership inference attacks (MIA) and attribute inference attacks (AIA). Our evaluation across 9 benchmark datasets using various attack methods (NN-based, metric-based, and label-only MIA, and NN-based AIA) highlights new insights. First, MIA is slightly more effective against CL, especially on a subset of challenging training samples. Second, models trained with CL are less susceptible to AIA than to MIA. Third, established defense techniques such as DP-SGD, MemGuard, and MixupMMD remain effective under CL, albeit with a notable accuracy cost for DP-SGD. Lastly, we propose a novel MIA, called Diff-Cali, which leverages difficulty scores to enhance calibration and is effective against all CL and normal training methods. With this study, we hope to draw the community's attention to the unintended privacy risks of emerging machine-learning techniques and to the development of new attack benchmarks and defense solutions.
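The abstract compresses two mechanisms that a small code sketch makes concrete: the easy-to-hard sample ordering that defines curriculum learning, and the general idea of calibrating a membership score with the same difficulty signal, as in Diff-Cali. The Python below is a minimal illustrative sketch, not the paper's implementation: the difficulty scores, target-model confidences, the four-stage schedule, and the `train_one_epoch` placeholder are all assumed stand-ins.

```python
# Minimal sketch (illustrative only, not the paper's method): curriculum
# ordering by a per-sample difficulty score, plus a difficulty-calibrated
# membership score in the spirit of Diff-Cali.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1000 samples; "difficulty" stands in for a real score
# (e.g., a transfer model's loss on each sample).
X = rng.normal(size=(1000, 16))
y = rng.integers(0, 2, size=1000)
difficulty = rng.random(1000)

# --- Curriculum learning: reveal data from easy to hard in fixed stages ---
order = np.argsort(difficulty)  # easiest samples first
n_stages = 4                    # assumed schedule, purely illustrative
for stage in range(1, n_stages + 1):
    visible = order[: len(order) * stage // n_stages]
    # train_one_epoch(model, X[visible], y[visible])  # hypothetical training step
    print(f"stage {stage}: training on {len(visible)} easiest samples")

# --- Difficulty-calibrated membership score (illustrative) ---
# A plain metric-based MIA thresholds the target model's confidence directly;
# the calibrated variant subtracts a difficulty-conditioned baseline so that
# inherently hard samples are not uniformly labeled non-members.
target_conf = rng.random(1000)   # stand-in for target-model confidence
baseline = 1.0 - difficulty      # assumed baseline: easy samples score high anyway
calibrated_score = target_conf - baseline
is_member_pred = calibrated_score > 0.0  # illustrative decision threshold
```

Subtracting a difficulty-conditioned baseline is one plausible reading of "leveraging difficulty scores to enhance calibration": without it, a confidence threshold conflates sample difficulty with non-membership, which matters under CL precisely because the attack's advantage concentrates on the challenging training samples.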
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 421