Towards Trustworthy Federated Learning with Untrusted Participants

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Resilience against malicious participants and data privacy are essential for trustworthy federated learning, yet achieving both with good utility typically requires the strong assumption of a trusted central server. This paper shows that a significantly weaker assumption suffices: each pair of participants shares a randomness seed unknown to the others. In a setting where malicious participants may collude with an untrusted server, we propose CafCor, an algorithm that integrates robust gradient aggregation with correlated noise injection derived from the randomness shared between participants. We prove that CafCor achieves strong privacy-utility trade-offs, significantly outperforming local differential privacy (DP) methods, which make no trust assumptions, while approaching the utility of central DP, where the server is fully trusted. Empirical results on standard benchmarks validate CafCor's practicality, showing that privacy and robustness can coexist in distributed systems without sacrificing utility or trusting the server.
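To make the mechanism concrete, here is a minimal sketch of correlated noise injection combined with a robust aggregator. All names (`pairwise_noise`, `mask_gradient`, `trimmed_mean`) and parameter choices are hypothetical illustrations, not CafCor's actual construction: each pair of participants derives an identical noise vector from its shared seed and applies it with opposite signs, so the correlated terms cancel under plain averaging, while the robust rule here is a generic coordinate-wise trimmed mean standing in for the paper's aggregator.

```python
import numpy as np

def pairwise_noise(my_id, peer_id, seed, dim, scale):
    # Both endpoints derive the same Gaussian vector from their shared
    # seed; the lower-id party adds it and the higher-id party subtracts
    # it, so the pair's contributions cancel in a plain sum.
    z = np.random.default_rng(seed).normal(0.0, scale, size=dim)
    return z if my_id < peer_id else -z

def mask_gradient(grad, my_id, shared_seeds, corr_scale, indep_scale):
    # Message an honest participant sends to the untrusted server.
    # `shared_seeds` maps each peer id to the seed shared with that peer
    # (unknown to the server); a small independent Gaussian term keeps
    # each message differentially private on its own.
    masked = np.asarray(grad, dtype=float).copy()
    for peer_id, seed in shared_seeds.items():
        masked += pairwise_noise(my_id, peer_id, seed, masked.size, corr_scale)
    return masked + np.random.default_rng().normal(0.0, indep_scale, masked.size)

def trimmed_mean(messages, trim):
    # Stand-in robust aggregator: per coordinate, drop the `trim` largest
    # and smallest values, then average the remainder.
    x = np.sort(np.stack(messages), axis=0)
    return x[trim: len(messages) - trim].mean(axis=0)
```

Under plain averaging the correlated terms cancel exactly and only the small independent noise survives, which is how shared randomness can approach central-DP utility; a trimming rule reorders values per coordinate and breaks that exact cancellation, so privacy and robustness must be analyzed jointly, which is the trade-off the paper's guarantees quantify.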
Lay Summary: Federated learning lets many users train a machine learning model together without sharing their personal data, helping to protect privacy. However, most current methods rely on a central server that must be trusted to act honestly and keep data secure. Our research introduces a new method called CafCor that removes the need to trust this central server. Instead, it uses a small amount of shared random information that participants exchange privately with each other before training; this information remains hidden from the server. The shared randomness protects user privacy, though it can make it harder to guard against dishonest participants. CafCor carefully balances these trade-offs, achieving better privacy and performance than existing approaches that assume no trust at all, and approaching the effectiveness of methods that fully trust the server. Our experiments show that it is possible to train models securely and reliably even when the server is not trusted.
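To see the lay-summary claim in miniature, the toy snippet below (entirely hypothetical numbers and seed layout, not CafCor itself) masks three participants' updates with pairwise-correlated noise: any single message the server sees is dominated by noise, yet the average recovers the true aggregate exactly because each pair's noise cancels.

```python
import numpy as np

dim, scale = 4, 10.0
ids = [0, 1, 2]
# One seed per pair, exchanged privately before training and hidden
# from the server (hypothetical toy setup, not the paper's protocol).
seeds = {(i, j): 1000 + 10 * i + j for i in ids for j in ids if i < j}

grads = {i: np.full(dim, float(i + 1)) for i in ids}  # toy "gradients"

def masked(i):
    m = grads[i].copy()
    for (a, b), s in seeds.items():
        if i in (a, b):
            z = np.random.default_rng(s).normal(0.0, scale, dim)
            m += z if i == a else -z  # lower id adds, higher id subtracts
    return m

msgs = [masked(i) for i in ids]
print(msgs[0])                # a single message: dominated by noise
print(np.mean(msgs, axis=0))  # the average: noise cancels, ~[2. 2. 2. 2.]
```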
Primary Area: Social Aspects->Privacy
Keywords: distributed machine learning, differential privacy, robustness, federated learning, trustworthy machine learning
Flagged For Ethics Review: true
Submission Number: 677