Interpreting Memorization in Deep Learning from Data Distribution

Published: 01 Jan 2024, Last Modified: 09 Aug 2024, ICASSP 2024, CC BY-SA 4.0
Abstract: A deep learning model can be vulnerable to a membership inference attack (MIA), which allows an attacker to determine whether a specific data record was used in its training. In this paper, we investigate the unfairness of disparate vulnerability to MIA across subgroups characterized by their data distributions. We propose three practical methods for characterizing the distribution of complex training data for deep learning models, which we validate to be effective in identifying vulnerable data records, and we provide a theoretical definition of MIA vulnerability. Experimental results demonstrate the impact of data distribution on disparate vulnerability: out-of-distribution outliers are attacked much more easily than normal data records. Even when MIA accuracy over the whole population looks no better than random guessing, certain groups of "outliers" can be significantly more vulnerable than others. For example, the attack accuracy on examples with the largest 10% outlierness is 15% higher than on in-distribution examples.
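The abstract's "outlierness" grouping and the per-group gap in attack accuracy can be illustrated with a small sketch. The snippet below is a minimal toy, not the paper's three proposed characterization methods: it scores outlierness by the mean distance to the k nearest neighbors in a hypothetical feature space, splits examples into the top-10% outliers versus the rest, and compares the accuracy of a simple loss-threshold membership inference attack on each group. The data, the feature space, and the attack rule are all synthetic assumptions made for illustration.

```python
# Illustrative sketch only (not the paper's method): score outlierness by mean
# k-nearest-neighbor distance, then compare a simple loss-threshold MIA's
# accuracy on the top-10% outliers versus in-distribution examples.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Toy stand-ins: per-example features and per-example losses for members
# (training records) and non-members (held-out records).
n = 2000
features = rng.normal(size=(2 * n, 16))                     # hypothetical embeddings
is_member = np.concatenate([np.ones(n), np.zeros(n)]).astype(bool)

# Synthesize a small out-of-distribution subgroup whose member examples are
# memorized more strongly (larger member/non-member loss gap), mimicking the
# paper's finding that outliers are more vulnerable.
ood = rng.random(2 * n) < 0.1
features[ood] += rng.normal(scale=6.0, size=(ood.sum(), 16))
losses = rng.exponential(scale=1.0, size=2 * n)
losses[is_member & ood] *= 0.2                              # strongly memorized outliers
losses[is_member & ~ood] *= 0.8                             # mildly memorized in-distribution points

# 1) Outlierness score: mean distance to the k nearest neighbors.
k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
dists, _ = nn.kneighbors(features)
outlierness = dists[:, 1:].mean(axis=1)                     # drop the self-distance column

# 2) Split into the top-10% outliers versus the rest.
is_outlier = outlierness >= np.quantile(outlierness, 0.9)

# 3) Simple loss-threshold MIA: predict "member" if the loss is below the median.
predicted_member = losses < np.median(losses)

def attack_accuracy(mask):
    """MIA accuracy restricted to the subgroup selected by `mask`."""
    return np.mean(predicted_member[mask] == is_member[mask])

print(f"attack accuracy, top-10% outliers : {attack_accuracy(is_outlier):.3f}")
print(f"attack accuracy, in-distribution  : {attack_accuracy(~is_outlier):.3f}")
print(f"attack accuracy, overall          : {attack_accuracy(np.ones(2 * n, dtype=bool)):.3f}")
```

On this synthetic data the attack looks close to random guessing overall while being noticeably more accurate on the outlier subgroup, which is the qualitative effect the abstract reports; the actual numbers depend entirely on the toy construction.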