Reducing Forgetting in Federated Learning with Truncated Cross-Entropy

Gwen Legate; Lucas Caccia; Eugene Belilovsky

Published: 21 Oct 2022, Last Modified: 05 May 2023 | NeurIPS 2022 Workshop DistShift Poster | Readers: Everyone

Keywords: catastrophic forgetting, client drift, federated learning, heterogeneous data

TL;DR: In federated learning, local models experience catastrophic forgetting with respect to other clients' data; we use truncated cross-entropy to reduce it.

Abstract: In federated learning (FL), a global model is learned by aggregating model updates computed from a set of client nodes, each with its own data. A key challenge in FL is data heterogeneity: client data distributions differ from one another. Standard FL algorithms perform multiple gradient steps before synchronizing the model, which can lead clients to over-minimize their local objectives and diverge from other clients' solutions. We demonstrate that in such a setting individual client models experience "catastrophic forgetting" with respect to other clients' data. We propose a simple yet efficient approach that modifies the cross-entropy objective on a per-client basis such that classes outside a client's label set are shielded from abrupt representation change. Through empirical evaluations, we demonstrate that our approach can alleviate this problem, especially under the most challenging FL settings with high heterogeneity and low client participation.
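
The abstract does not spell out the modified objective, so the following is only a minimal sketch of one plausible reading: the softmax normalization is restricted to the classes present in a client's label set, so logits of absent classes receive zero gradient from the loss. The function names, the `client_classes` argument, and the pure-Python formulation are assumptions made for illustration, not the authors' implementation.

```python
import math


def truncated_cross_entropy(logits, target, client_classes):
    """Cross-entropy whose softmax runs only over the client's own classes.

    Assumption: "truncated" means absent classes are dropped from the
    normalization term. This is an illustrative reading of the abstract.
    """
    if target not in client_classes:
        raise ValueError("target must belong to the client's label set")
    kept = sorted(client_classes)
    # Numerically stable log-sum-exp over the kept logits only.
    m = max(logits[c] for c in kept)
    lse = m + math.log(sum(math.exp(logits[c] - m) for c in kept))
    return lse - logits[target]


def truncated_cross_entropy_grad(logits, target, client_classes):
    """Gradient of the truncated loss with respect to every logit."""
    kept = sorted(client_classes)
    m = max(logits[c] for c in kept)
    z = sum(math.exp(logits[c] - m) for c in kept)
    grad = [0.0] * len(logits)  # absent classes keep a zero gradient
    for c in kept:
        grad[c] = math.exp(logits[c] - m) / z
    grad[target] -= 1.0
    return grad


# Hypothetical client that only holds examples of classes 0 and 1.
logits = [2.0, 1.0, 0.5, -1.0]
loss = truncated_cross_entropy(logits, target=0, client_classes={0, 1})
grad = truncated_cross_entropy_grad(logits, target=0, client_classes={0, 1})
```

In this sketch the entries of `grad` for classes 2 and 3 are exactly zero, which is one way the output-layer logits for classes outside the client's label set could be shielded from the loss gradient during local training.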
