Abstract: Recent studies show that privacy leakage can occur in vertical federated learning (VFL), where parties hold disjoint features of the same samples. While various attacks, including label and feature inference, target record-level privacy risks in VFL, few studies have examined distribution-level privacy threats. In this paper, we explore property inference attacks (PIAs) in VFL, where an adversarial party seeks to infer global distribution information about a target property in the victim party's training set. Our key observation is that the $L_p$-norm distribution of intermediate results in VFL can reflect the fraction of the target property in a training set. Inspired by this, we present ProVFL, a novel PIA framework comprising distribution comparison and correlation augmentation modules. To perform property inference, we design a distribution comparison module that constructs intermediate-result populations with varying property proportions, aiming to learn the relationship between $L_p$-norm distributions and the corresponding fractions. We then theoretically analyze the factors that contribute to attack effectiveness and develop a correlation augmentation module, based on label replacement and model refinement, to amplify property information leakage. Extensive experimental results demonstrate that our attacks achieve estimation errors as low as 1%. These results highlight an immediate threat of property information leakage from private training data in the VFL setting.
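To make the key observation concrete, the following is a minimal sketch of the distribution-comparison idea, not the paper's actual ProVFL implementation: an adversary with auxiliary samples that do and do not carry the target property builds reference populations of $L_p$ norms at candidate mixing fractions, then selects the fraction whose norm distribution is closest to that of the victim's intermediate results. All names (`lp_norms`, `estimate_fraction`, `aux_with`, `aux_without`, `victim_embeddings`) and the use of the 1-Wasserstein distance as the comparison metric are illustrative assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def lp_norms(embeddings: np.ndarray, p: float = 2.0) -> np.ndarray:
    """L_p norm of each intermediate-result vector (one row per sample)."""
    return np.linalg.norm(embeddings, ord=p, axis=1)


def estimate_fraction(victim_embeddings, aux_with, aux_without,
                      candidate_fractions=np.linspace(0.0, 1.0, 21),
                      n_per_population=1000, p=2.0, seed=None):
    """Hypothetical sketch: pick the candidate property fraction whose
    mixed auxiliary population has the closest L_p-norm distribution
    to the victim's intermediate results."""
    rng = np.random.default_rng(seed)
    target_norms = lp_norms(victim_embeddings, p)
    best_frac, best_dist = None, np.inf
    for frac in candidate_fractions:
        # Mix auxiliary embeddings at the candidate fraction of the property.
        k = int(round(frac * n_per_population))
        idx_w = rng.integers(0, len(aux_with), size=k)
        idx_o = rng.integers(0, len(aux_without), size=n_per_population - k)
        mixed = np.vstack([aux_with[idx_w], aux_without[idx_o]])
        # Compare the two L_p-norm distributions via 1-Wasserstein distance.
        d = wasserstein_distance(lp_norms(mixed, p), target_norms)
        if d < best_dist:
            best_frac, best_dist = frac, d
    return best_frac
```

Under this sketch, the resolution of the estimate is set by the granularity of `candidate_fractions`; a finer grid trades compute for lower estimation error.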