Abstract: Federated Learning (FL) is a distributed machine learning framework in which raw data never leave the participating clients' machines, with the aim of preserving privacy. Due to its distributed nature, FL is especially vulnerable to data poisoning attacks, which degrade the overall performance of the framework. Hence there is a growing need to identify and remove malicious clients early. However, correctly identifying malicious clients is difficult: clients with non-IID (not Independently and Identically Distributed) data and those with malicious data, for example, are hard to distinguish, because non-IID data also deviate from the overall data distribution. Prior works focus on improving performance in the presence of either non-IID data or malicious data, but not both. In contrast, this paper proposes a mechanism that identifies and classifies three types of clients: those holding IID, non-IID, and malicious data. Our findings can help future studies remove malicious clients efficiently while training a model on diverse data.