- Abstract: Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting. This paper considers the related question of “membership inference”, where the goal is to determine if an image was used during training. We consider membership tests over either ensembles of samples or individual samples. First, we show how to detect if a dataset was used to train a model, and in particular whether some validation images were used at train time. Then, we introduce a new approach to infer membership when a few of the top layers are not available or have been fine-tuned, and show that lower layers still carry information about the training samples. To support our findings, we conduct large-scale experiments on Imagenet and subsets of YFCC-100M with modern architectures such as VGG and Resnet.
- Keywords: membership inference, memorization, attack, privacy
- TL;DR: We analyze how a convnet memorizes its training set and propose several use cases in which information about the training set can be extracted from the model.
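The membership tests described in the abstract exploit the fact that a model tends to fit its training samples more closely than unseen samples. A minimal sketch of this idea, using a simulated loss-threshold attack on toy data (the Gaussian loss distributions and the threshold are illustrative assumptions, not the paper's method):

```python
import random

random.seed(0)

# Toy assumption: training samples ("members") tend to have lower loss
# than held-out samples ("non-members") because the model partially
# memorizes them. We simulate both loss distributions.
member_losses = [random.gauss(0.5, 0.2) for _ in range(1000)]
nonmember_losses = [random.gauss(1.5, 0.4) for _ in range(1000)]

def is_member(loss, threshold=1.0):
    """Predict membership: loss below threshold -> likely a training sample."""
    return loss < threshold

# Measure the attack's accuracy on the toy data.
correct = sum(is_member(l) for l in member_losses)
correct += sum(not is_member(l) for l in nonmember_losses)
accuracy = correct / (len(member_losses) + len(nonmember_losses))
print(f"attack accuracy: {accuracy:.2f}")
```

With well-separated loss distributions, such a test distinguishes members from non-members far above chance; the paper's contribution is to make such inference work at scale and even when only lower layers of the network are available.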