Improving self-supervised vertical federated learning with contrastive instance-wise similarity and dynamical balance pool

Published: 01 Jan 2025, Last Modified: 01 Aug 2025. Future Gener. Comput. Syst. 2025. License: CC BY-SA 4.0
Abstract: Vertical Federated Learning (VFL) enables multiple parties with distinct feature spaces to collaboratively train a joint VFL model without exposing their original private data. In realistic scenarios, the scarcity of aligned and labeled samples among collaborating participants limits the effectiveness of traditional VFL approaches to model training. Current VFL frameworks attempt to leverage abundant unlabeled data using Contrastive Self-Supervised Learning (CSSL). However, naively incorporating CSSL methods cannot address the severe domain shift in VFL. Moreover, CSSL methods typically conflict with the general regularization approaches designed to alleviate domain shift, significantly limiting the potential of self-supervised learning frameworks in VFL. To address these challenges, this study proposes an Improved Self-Supervised Vertical Federated Learning (ISSVFL) framework for label-scarce scenarios under the semi-honest, no-collusion assumption. ISSVFL merges CSSL with instance-wise similarity to resolve the regularization conflict and to capture richer inter-domain knowledge in the representations of different participants, effectively alleviating domain shift. In addition, a new dynamical balance pool is proposed to fine-tune the pre-trained models for downstream supervised tasks by dynamically balancing inter-domain and intra-domain knowledge. Extensive experiments on image and tabular datasets demonstrate that ISSVFL achieves an average performance improvement of 3.3% over state-of-the-art baselines.
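The abstract does not give implementation details, but the core pretraining idea, contrastive learning across party-specific representations combined with an instance-wise similarity term, can be illustrated concretely. Below is a minimal PyTorch sketch assuming a two-party setup; the encoder names, the exact loss forms, and the 0.5 weighting are assumptions for illustration, not the published ISSVFL implementation.

```python
# Hypothetical sketch: cross-party contrastive pretraining with an
# instance-wise similarity term. Names and loss forms are illustrative
# assumptions, not the paper's published method.
import torch
import torch.nn.functional as F

def cross_party_contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss: the two parties' representations of the same
    aligned instance form a positive pair; all other instances in the
    batch serve as negatives."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature              # (B, B) cross-party similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

def instance_similarity_loss(z_a, z_b):
    """Instance-wise similarity term (assumed form): align the two parties'
    intra-batch similarity structures, sharing inter-domain knowledge
    without forcing the representations themselves to be identical."""
    sim_a = F.normalize(z_a, dim=1) @ F.normalize(z_a, dim=1).t()
    sim_b = F.normalize(z_b, dim=1) @ F.normalize(z_b, dim=1).t()
    return F.mse_loss(sim_a, sim_b)

# Usage with a batch of aligned, unlabeled samples split by feature space:
# z_a = encoder_a(x_a); z_b = encoder_b(x_b)   # hypothetical party encoders
# loss = cross_party_contrastive_loss(z_a, z_b) \
#        + 0.5 * instance_similarity_loss(z_a, z_b)
```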