Model Extraction Attacks on Split Federated Learning

16 May 2022 (modified: 05 May 2023) NeurIPS 2022 Submitted
Keywords: Split Federated Learning, Model Extraction Attack
TL;DR: We present the first investigation of model extraction attacks on Split Federated Learning and propose five effective attacks
Abstract: Federated learning (FL) is a popular collaborative learning scheme involving multiple clients and a server. FL protects clients' data privacy but exposes interfaces for Model Extraction (ME) attacks: because FL periodically collects and shares model parameters, a malicious client can download the latest model and thus steal model Intellectual Property (IP). Split Federated Learning (SFL), a recent variant of FL, splits the model into two parts, giving one part (the client-side model) to clients and the remaining part (the server-side model) to the server. While SFL was primarily designed to facilitate training on resource-constrained devices, it also prevents some ME attacks by blocking prediction queries. In this work, we expose the vulnerability of SFL and show how ME attacks can be launched by malicious clients that query gradient information from the server side. We propose five ME attacks that differ in how the gradient information is used (for data crafting, data generation, gradient matching, and soft-label crafting) and in their assumptions about the attacker's data availability. We show that the proposed ME attacks work exceptionally well on SFL. For instance, when the server-side model has five layers, our proposed ME attack achieves over 90% accuracy, with less than 2% accuracy degradation relative to the victim, for VGG-11 on CIFAR-10.
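The attack surface described in the abstract can be illustrated with a minimal sketch (not the paper's implementation): in SFL the client sends split-layer activations to the server and receives the loss gradient with respect to those activations during backpropagation. A malicious client can record these (activation, gradient) pairs and fit a surrogate server-side model by gradient matching. The scalar toy models, the label knowledge, and all names below are simplifying assumptions made here for illustration.

```python
# Hypothetical "server-side model": a single scalar weight ws,
# with loss = 0.5 * (ws * a - y)^2 for split-layer activation a and label y.
WS_TRUE = 2.0  # hidden server-side parameter the attacker wants to extract

def server_grad(a, y, ws=WS_TRUE):
    """Gradient of the loss w.r.t. the split-layer activation a,
    which SFL returns to the client during backpropagation."""
    return ws * (ws * a - y)

# Malicious client: submit activations, record the returned gradients.
queries = [(0.5, 1.0), (1.0, 1.0), (1.5, 1.0), (-1.0, 1.0)]  # (activation, label)
recorded = [(a, y, server_grad(a, y)) for a, y in queries]

# Gradient matching: tune a surrogate weight so the gradient it would
# induce at the split layer matches the gradients the server returned.
ws_hat, lr = 0.5, 0.01
for _ in range(2000):
    grad = 0.0
    for a, y, g in recorded:
        g_hat = ws_hat * (ws_hat * a - y)               # surrogate's split-layer gradient
        grad += 2 * (g_hat - g) * (2 * ws_hat * a - y)  # d/dws_hat of (g_hat - g)^2
    ws_hat -= lr * grad

print(round(ws_hat, 3))  # converges to the hidden server-side weight, 2.0
```

With enough queries, the surrogate recovers the server-side parameter without ever issuing a prediction query, which is the interface SFL blocks.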
Supplementary Material: zip