Secure Interaction-Based Feature Selection for Vertical Federated Learning

Published: 01 Jan 2024, Last Modified: 17 Feb 2025, Venue: ICC 2024, License: CC BY-SA 4.0
Abstract: Federated learning enables decentralized data owners to collaborate and train models in a distributed manner. A special type is Vertical Federated Learning (VFL), where each participating data owner holds only a portion of the data features. To maintain high accuracy at a reasonable computational cost, selecting a subset of features from the entire dataset is essential. Some existing work selects features by calculating their individual contributions to the learning outcomes, but capturing the joint contribution of multiple features is also necessary and remains challenging. Meanwhile, security concerns arise when calculating the joint contribution of a set of features whose data are stored by different owners. Using homomorphic encryption or secure computation over encrypted data is possible, but it may be too costly when complex calculations are involved and repeated. To this end, this paper proposes a privacy-preserving feature selection protocol that considers the interactions between features stored across different data owners. Specifically, we first propose an interaction-based feature selection algorithm for vertically distributed datasets. This algorithm estimates the features' joint contributions to the model training outcomes. Then, we propose a privacy-preservation protocol that prevents a semi-honest cloud server from obtaining or inferring the raw data when it aggregates the knowledge and calculates the complex interaction measure for feature selection. To address the high computational cost of securely calculating the interaction measure, we create a new approximation method for it that maintains the training accuracy. The security discussion shows that the proposed protocol preserves the data owners' privacy. Extensive simulations validate the achieved training accuracy and efficiency.
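To illustrate why joint contributions matter, the following is a minimal plaintext sketch. It assumes discrete features and uses interaction gain, I(X1, X2; Y) - I(X1; Y) - I(X2; Y), as the interaction measure. This is a common information-theoretic choice, not necessarily the measure or the approximation used in the paper, and the function names are hypothetical. The sketch applies no encryption, so it says nothing about the privacy-preserving protocol itself.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_gain(x1, x2, y):
    """Assumed interaction measure: I(X1, X2; Y) - I(X1; Y) - I(X2; Y).

    A positive value means the two features tell us more about the label
    together than the sum of what each tells us alone.
    """
    joint = list(zip(x1, x2))
    return (
        mutual_information(joint, y)
        - mutual_information(x1, y)
        - mutual_information(x2, y)
    )


# Toy vertical split: owner A holds x1, owner B holds x2, and the label is
# their XOR. Each feature alone carries no information about the label.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

print(mutual_information(x1, y))    # individual contribution of x1
print(mutual_information(x2, y))    # individual contribution of x2
print(interaction_gain(x1, x2, y))  # joint contribution beyond the individual ones
```

In this toy case a selector that scores features one at a time would discard both x1 and x2, while an interaction-aware score keeps them. Computing such a score requires values from both owners, which is the setting where a secure protocol becomes necessary.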