OSLO: One-Shot Label-Only Membership Inference Attacks

Published: 25 Sept 2024, Last Modified: 29 Sept 2024. NeurIPS ’24. License: CC BY 4.0
Abstract: We introduce One-Shot Label-Only (OSLO) membership inference attacks (MIAs), which infer a given sample’s membership in a target model’s training set with high precision using just a single query, where the target model returns only the predicted hard label. In contrast, state-of-the-art label-only attacks require ∼6000 queries yet achieve lower attack precision than OSLO. OSLO leverages transfer-based black-box adversarial attacks. The core idea is that a member sample is more resistant to adversarial perturbations than a non-member. We compare OSLO against state-of-the-art label-only attacks and demonstrate that, despite requiring only one query, our method significantly outperforms previous attacks in terms of precision and true positive rate (TPR) at the same false positive rate (FPR). For example, compared to previous label-only MIAs, OSLO achieves a TPR that is 7× to 28× higher at a 0.1% FPR on CIFAR10 for a ResNet model. We also evaluate multiple defense mechanisms against OSLO.
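The core idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the perturbation is crafted or how its size is chosen, so the example below substitutes a single signed-gradient step against a linear surrogate with a fixed budget `eps`. The helper names (`craft_transfer_perturbation`, `infer_membership`, `query_hard_label`), the toy weights, and the two sample points are all hypothetical. The only part taken from the abstract is the decision rule: perturb the sample using a surrogate, send one query to the target, and predict "member" if the hard label is unchanged.

```python
import numpy as np


def craft_transfer_perturbation(x, y, surrogate_w, eps):
    """Hypothetical helper: one signed-gradient step on a linear surrogate.

    For a linear score s = w . x and a label y in {-1, +1}, moving x along
    -y * sign(w) lowers the margin y * s, i.e. pushes x toward misclassification.
    The target model is never queried here.
    """
    return x - eps * y * np.sign(surrogate_w)


def infer_membership(query_hard_label, x_adv, y):
    """Single label-only query: predict 'member' if the label survives."""
    return query_hard_label(x_adv) == y


# Toy target model (a stand-in, not a trained network); it exposes hard labels only.
target_w = np.array([1.0, 1.0])


def query_hard_label(x):
    return 1 if float(target_w @ x) >= 0 else -1


# Assumption: the attacker's surrogate is similar to, but not equal to, the target.
surrogate_w = np.array([0.9, 1.1])
eps = 0.5  # assumed fixed perturbation budget

x_far = np.array([2.0, 2.0])   # large margin: plays the role of a member
x_near = np.array([0.2, 0.2])  # small margin: plays the role of a non-member

adv_far = craft_transfer_perturbation(x_far, 1, surrogate_w, eps)
adv_near = craft_transfer_perturbation(x_near, 1, surrogate_w, eps)

member_far = infer_membership(query_hard_label, adv_far, 1)    # label survives
member_near = infer_membership(query_hard_label, adv_near, 1)  # label flips

print(member_far, member_near)
```

In this toy setting the large-margin point keeps its label after perturbation and is flagged as a member, while the small-margin point flips and is not. How well such a rule separates real members from non-members depends on how the perturbation is calibrated, which this sketch does not attempt to model.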