Feedback-Guided Black-box Attack in Federated Learning: A Cautious Attacker Perspective

TMLR Paper3896 Authors

08 Jan 2025 (modified: 18 Feb 2025) · Under review for TMLR · CC BY 4.0
Abstract: Federated Learning (FL) is a robust approach to collaborative machine learning that upholds data privacy by ensuring that data remains with its owners. However, FL systems are vulnerable to sophisticated adversarial attacks from malicious clients, especially in black-box settings. Unlike centralized data poisoning, attacking FL presents unique challenges: (i) server-side defense mechanisms can detect and discard suspicious client updates, so attacks must maintain minimal visibility across multiple training rounds, and (ii) malicious clients must repeatedly generate poisoned data using only their local black-box model in each round, since previous poisoning attempts may be nullified during global aggregation. Adversaries are therefore forced to craft stealthy poisoned data locally, in a black-box context, in every round, maintaining low visibility while ensuring impact. Existing FL attack methods tend to be highly visible because of the nature of the attacks, the scale of the perturbations they introduce, and their lack of strategies for evading detection. Moreover, these methods often rely on maximizing the cross-entropy loss on the true class, resulting in delayed attack convergence and highly noticeable perturbations. It is therefore crucial to develop a stealthy, low-visibility data poisoning attack for black-box settings in order to understand how a cautious attacker would design an FL attack. To address these challenges, we propose the Feedback-guided Causative Image Black-box Attack (F-CimBA), designed specifically for FL, which poisons data by adding random perturbation noise. F-CimBA minimizes the loss of the most confused class (i.e., the incorrect class to which the model assigns the highest probability) instead of the true class, allowing it to exploit local model vulnerabilities for early attack convergence. This ensures that poisoned updates maintain low visibility, reducing the likelihood of server-side rejection. Furthermore, F-CimBA adapts effectively to non-IID data distributions and varying attack scenarios, consistently degrading the global model's performance. We also analyze its impact on system hardware metrics, highlighting F-CimBA's stealth and efficiency given the computational overhead of repeated poisoning attempts in FL. Our evaluation demonstrates F-CimBA's consistent ability to poison the global model with minimal visibility under varying attack scenarios and non-IID data distributions, even in the presence of robust server-side defenses.
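To make the most-confused-class objective concrete, below is a minimal illustrative sketch, not the authors' exact F-CimBA algorithm, of feedback-guided random-noise poisoning against a local black-box model. The function names, step budget, noise magnitude eps, and the assumption of [0, 1]-normalized image tensors are all hypothetical; only model predictions are queried, never gradients, consistent with the black-box setting described in the abstract.

import torch
import torch.nn.functional as F

def most_confused_class(model, x, y_true):
    # The incorrect class to which the model assigns the highest probability.
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
    probs[torch.arange(x.size(0)), y_true] = -1.0  # mask out the true class
    return probs.argmax(dim=1)

def feedback_guided_poison(model, x, y_true, eps=0.03, steps=50):
    # Sketch: propose small random perturbations and keep only those that
    # reduce cross-entropy on the most confused class, using the local model
    # purely as a black-box query oracle.
    x_adv = x.clone()
    y_target = most_confused_class(model, x, y_true)
    with torch.no_grad():
        best_loss = F.cross_entropy(model(x_adv), y_target, reduction='none')
    for _ in range(steps):
        delta = torch.empty_like(x_adv).uniform_(-eps, eps)
        candidate = (x_adv + delta).clamp(0.0, 1.0)  # assumes [0, 1] pixel range
        with torch.no_grad():
            loss = F.cross_entropy(model(candidate), y_target, reduction='none')
        improved = loss < best_loss  # feedback: accept only helpful noise
        x_adv[improved] = candidate[improved]
        best_loss = torch.minimum(best_loss, loss)
    return x_adv

The design choice mirrored here is the one the abstract emphasizes: candidate noise is accepted only when black-box feedback shows the loss on the most confused class decreasing, rather than maximizing loss on the true class, which is what enables small perturbations and early convergence.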
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Aurélien_Bellet1
Submission Number: 3896