On the Pitfalls of Security Evaluation of Robust Federated Learning

Published: 01 Jan 2023, Last Modified: 06 Nov 2023. Venue: SP (Workshops) 2023
Abstract: Prior literature has demonstrated that federated learning (FL) is vulnerable to poisoning attacks that aim to jeopardize FL performance, and consequently has introduced numerous defenses and demonstrated their robustness in various FL settings. In this work, we closely investigate a largely overlooked aspect of the robust FL literature: the experimental setup used to evaluate the robustness of FL poisoning defenses. We thoroughly review 50 defense works and highlight several questionable trends in the experimental setups of FL poisoning defense papers; we discuss the potential repercussions of such setups on the key conclusions these works draw about the robustness of the proposed defenses. As a representative case study, we also evaluate a recent poisoning recovery paper from IEEE S&P'23, called FedRecover. Our case study demonstrates the importance of experimental setup decisions (e.g., selecting representative and challenging datasets) for the validity of robustness claims; for instance, while FedRecover performs well on MNIST and FashionMNIST (used in the original paper), in our experiments it performed poorly on FEMNIST and CIFAR10.