BiFD: A Bidirectional Feature Discrepancy Defense against Hijacking Attack in Split Learning

Published: 01 Jan 2025 · Last Modified: 10 Nov 2025 · ICME 2025 · CC BY-SA 4.0
Abstract: Split Learning (SL) is a widely adopted distributed privacy-preserving training paradigm that imposes minimal computational overhead on clients. However, the Feature-Space Hijacking Attack (FSHA) poses a significant threat to SL: a malicious server manipulates the client's optimization process and thereby compromises input privacy. Prior studies propose that clients detect potential hijacking by monitoring the gradients returned by the server, but these gradient-based methods are vulnerable to adversarial anti-detection and lack robustness to changes in model architecture. In this paper, we propose a novel detection method named Bidirectional Feature Discrepancy Defense (BiFD), which operates on features, as they carry richer semantic information than gradients. We also observe that hijacked features are easier to reconstruct and harder to classify, providing a key distinction between malicious and honest servers that previous works have overlooked. Extensive experiments across multiple datasets and model architectures demonstrate that BiFD achieves accurate and robust detection.
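The abstract's key signal, that hijacked features become easier to reconstruct but harder to classify, can be illustrated with a minimal, self-contained PyTorch sketch. This is not the authors' implementation: the module sizes, the names (ProbeDecoder, ProbeClassifier, probe_step, discrepancy_score), and the score definition are all illustrative assumptions about how a client might probe its own smashed features locally.

```python
# Hedged sketch of the intuition described in the abstract (not BiFD itself):
# the client fits two small probes on its detached cut-layer features, one
# reconstructing the input and one predicting the label, and watches the gap
# between the two losses. Under FSHA, features drift toward the attacker's
# decodable space, so reconstruction gets easier while classification gets
# harder. All sizes and thresholds below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbeDecoder(nn.Module):
    """Tiny decoder that tries to reconstruct flattened inputs from features."""
    def __init__(self, feat_dim: int, input_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, input_dim)
        )

    def forward(self, z):
        return self.net(z)

class ProbeClassifier(nn.Module):
    """Tiny linear probe that predicts labels from features."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, z):
        return self.fc(z)

def probe_step(features, inputs, labels, decoder, classifier, opt):
    """One local self-monitoring step: fit both probes on detached features."""
    z = features.detach()  # probes must not influence the client model
    opt.zero_grad()
    recon_loss = F.mse_loss(decoder(z), inputs.flatten(1))
    cls_loss = F.cross_entropy(classifier(z), labels)
    (recon_loss + cls_loss).backward()
    opt.step()
    return recon_loss.item(), cls_loss.item()

def discrepancy_score(recon_loss: float, cls_loss: float) -> float:
    # High score = easy to reconstruct but hard to classify, the signature
    # the abstract attributes to hijacked features; a client could flag the
    # server once this score drifts past a calibrated threshold.
    return cls_loss - recon_loss

# Illustrative usage on random stand-ins for a real SL batch.
feat_dim, input_dim, num_classes = 64, 784, 10
decoder = ProbeDecoder(feat_dim, input_dim)
classifier = ProbeClassifier(feat_dim, num_classes)
opt = torch.optim.Adam(
    list(decoder.parameters()) + list(classifier.parameters()), lr=1e-3
)

x = torch.randn(32, 1, 28, 28)            # client inputs
y = torch.randint(0, num_classes, (32,))  # client labels
z = torch.randn(32, feat_dim)             # smashed features from the cut layer
r, c = probe_step(z, x, y, decoder, classifier, opt)
print(f"recon={r:.3f} cls={c:.3f} score={discrepancy_score(r, c):.3f}")
```

In a real deployment the two probe losses would be tracked over training rounds rather than compared once, since their absolute scales differ; the sketch only shows where each quantity would come from.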