Secure Feature Selection for Vertical Federated Learning in eHealth Systems

2022 (modified: 24 Apr 2023) | ICC 2022 | Readers: Everyone
Abstract: Privacy-preserving vertical federated learning (VFL) has been widely applied in electronic health (eHealth) systems. However, existing VFL schemes rarely consider the data pre-processing step, including feature selection; omitting it leads to a poor convergence rate and can even damage model utility. In this paper, we propose an efficient and privacy-preserving feature selection scheme for VFL. Specifically, we first propose a general Gini-impurity-based feature selection framework, which is compatible with most existing machine learning models in VFL. Within this framework, we present two concrete protocols (dubbed π_SS-FS and π_H-FS, respectively) customized for different eHealth scenarios. π_SS-FS exploits a lightweight additive secret sharing technique, so that it can be executed in time comparable to the evaluation of the plaintext scheme. π_H-FS is a hybrid feature selection protocol that additionally uses a linear homomorphic encryption technique to reduce the communication overhead at the cost of moderate runtime. Moreover, extensive evaluations conducted on real-world medical datasets demonstrate that our scheme achieves accuracy gains of up to 27%.
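The two building blocks named in the abstract, Gini impurity as a feature score and additive secret sharing, can be illustrated with a minimal plaintext sketch. This is not the paper's π_SS-FS or π_H-FS protocol: the function names, the modulus, and the two-party setting are assumptions made only for illustration, and no secure multiplication or homomorphic encryption step is shown.

```python
import secrets
from collections import Counter

# Hypothetical share modulus, chosen only for this illustration.
MODULUS = 2**61 - 1


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def feature_gini_score(feature_values, labels):
    """Weighted Gini impurity of the labels after grouping samples by
    feature value. Lower scores mean the feature separates classes better."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


def share(value, n_parties=2):
    """Split an integer into additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


def add_shared(shares_a, shares_b):
    """Add two shared values locally: each party adds its own two shares."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]
```

In this sketch, a feature that perfectly separates the labels scores 0, and counts held by different parties can be summed without either party revealing its own count, since addition of additive shares needs no communication. How the actual protocols compute the squared terms securely is described in the paper, not here.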