Abstract: The concentration of AI infrastructure in technologically advanced nations creates barriers for emerging economies developing sovereign AI capabilities. This paper presents DSAIN (Distributed Sovereign AI Network), a federated learning framework providing the first joint convergence guarantee under simultaneous Byzantine attacks and differential privacy. Unlike prior work addressing these challenges in isolation, DSAIN introduces FedSov, achieving O(1/T) convergence while reducing communication by 78% through adaptive top-k
compression with sparse noise injection. The ByzFed aggregation mechanism provides provable robustness against b < n/3 malicious participants, with formal proof that Byzantine adversaries cannot violate differential privacy guarantees (Theorem 4). Our primary contribution is the theoretical framework: the first convergence analysis unifying Byzantine resilience, differential privacy, and communication efficiency with explicit constants. Comprehensive experiments across 10 configurations on CIFAR-10 with ResNet18 validate the communication efficiency claims: DSAIN achieves 76.52% accuracy (matching FedAvg’s 76.69%) while transmitting only 22% of gradient information. Under Byzantine label-flipping attacks (10–20%), both methods maintain near-baseline performance, demonstrating inherent DNN robustness to label noise; we discuss implications for gradient manipulation attacks where
geometric median filtering provides stronger guarantees. We identify α ≈ 0.5 as a critical heterogeneity threshold. Privacy experiments reveal that naive federated DP-SGD requires sophisticated noise calibration beyond standard implementations—an open challenge we
characterize theoretically but defer practical solutions to future work. Code and reproducibility materials are available at https://github.com/TerexSpace/dsain-framework.
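For readers unfamiliar with the geometric median filtering mentioned in the abstract, a minimal Weiszfeld-iteration sketch is given below. This is an illustration of the general technique only; the function names are ours and this is not the ByzFed implementation.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Weiszfeld iteration: find the point minimizing the sum of Euclidean
    distances to the inputs. This notion of robust aggregation tolerates a
    minority of arbitrarily corrupted updates."""
    points = np.asarray(points, dtype=float)
    median = points.mean(axis=0)  # initialize at the coordinate-wise mean
    for _ in range(iters):
        dists = np.linalg.norm(points - median, axis=1)
        weights = 1.0 / np.maximum(dists, eps)  # inverse-distance weights
        new_median = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < eps:
            break
        median = new_median
    return median

# Seven honest gradients near (1, 1); three Byzantine clients send huge values.
updates = [np.array([1.0, 1.0]) + 0.01 * i for i in range(7)] \
        + [np.array([100.0, -100.0])] * 3
agg = geometric_median(updates)  # stays near the honest cluster
```

Unlike the coordinate-wise mean, which the three attackers would drag far from the honest cluster, the geometric median remains close to the honest updates, which is the intuition behind the b < n/3 robustness bound.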
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- Main empirical claims are grounded in the primary experimental suite.
- The manuscript distinguishes what is *proven* (theoretical guarantees) vs. what is *evaluated* (reported experiments).
- DP is stated as "optional" and positioned as record-level DP.
- The audit/provenance mechanism is explicitly optional to the core FL algorithm.
# Reviewer fB55
Comment 1 (Overclaim / align manuscript to evaluated artifacts)
Response: We tightened the scope and grounded claims strictly in the evaluated artifacts. Specifically:
- The Introduction includes explicit **Scope and Focus** and **Out of scope** blocks that constrain the contribution and avoid implying untested properties.
- The Experiments section explicitly defines the primary evaluation as **E1–E12** and ties the reported claims to the corresponding tables/figures.
- Any broader-coverage experiments are separated as an **Appendix rebuttal addendum** (E13–E14) and clearly labeled as optional.
Comment 2 (DP scope + privacy/utility caveats)
Response: We clarified the privacy model and limited the claim accordingly.
- DP is described as **optional record-level DP**, under an honest-but-curious server threat model, with explicit privacy–utility caveats.
- We explicitly note that small privacy budgets can materially degrade utility, and therefore we do not claim universal performance under strict DP.
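To make the record-level DP mechanism concrete, the sketch below shows the standard per-record clip-and-noise step that our discussion assumes. The function name is hypothetical and this is an illustration of the general mechanism, not the manuscript's implementation; the (clip_norm, noise_mult) pair would map to an (ε, δ) budget via standard Gaussian-mechanism accounting, which is omitted here.

```python
import numpy as np

def dp_record_update(per_example_grads, clip_norm=1.0, noise_mult=1.0, seed=None):
    """Record-level DP step (illustrative): clip each record's gradient to
    clip_norm, sum, add Gaussian noise scaled to the clipping bound, average."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

# The large gradient is clipped to unit norm; the small one passes through.
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
update = dp_record_update(grads, clip_norm=1.0, noise_mult=0.0)
```

The clipping bound is what makes the noise scale well-defined; shrinking the privacy budget (larger noise_mult) directly inflates the noise term, which is the privacy-utility tradeoff flagged above.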
Comment 3 (Provenance/ledger implications overstated)
Response: We revised Section 5 to position provenance as a **prototype audit trail** that is orthogonal to the core algorithm.
- We explicitly avoid claims about consensus protocols, zero-knowledge proofs, or end-to-end ledger overhead.
# Reviewer ZRFF
Comment 1 (Experimental setup clarity: hyperparams, models, rounds, etc.)
Response: We expanded the experimental setup description.
- The Experiments section now lists concrete settings (rounds, participation rate, models, attacks, and which experiments use DP / defenses).
- The complete experimental suite is explicitly enumerated as E1–E12 for auditability.
Comment 2 (Robustness plot interpretation clarity)
Response: We clarified the plot semantics.
- Figure captions/labels are phrased to prevent misreading the curves (e.g., **accuracy vs. rounds** rather than ambiguous wording).
# Reviewer gwaj
Comment 1 (Compression + DP incompatibility: “dense noise breaks top-k”)
Response: We clarified the mechanism and the intended compatibility.
- The algorithm text and privacy discussion now explain that noise can be applied **on the transmitted coordinates** (i.e., sparse/noise-on-top-k), preserving $\mathcal{O}(k)$ communication.
- We do not claim this resolves all DP–compression subtleties; rather, we state the precise mechanism used/assumed and its limitations.
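A minimal sketch of the noise-on-top-k idea described in this response is given below (hypothetical helper names; not the FedSov implementation). The point is only that noise added to the k transmitted coordinates keeps the message size at O(k), unlike dense noise added before selection.

```python
import numpy as np

def topk_sparse_dp(grad, k, sigma, seed=None):
    """Keep the k largest-magnitude coordinates and add Gaussian noise to
    those k values only, so the payload stays O(k): (indices, noisy values)."""
    rng = np.random.default_rng(seed)
    idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of the top-k entries
    vals = grad[idx] + rng.normal(0.0, sigma, size=k)
    return idx, vals

def decompress(idx, vals, dim):
    """Server side: scatter the k received values back into a dense vector."""
    out = np.zeros(dim)
    out[idx] = vals
    return out

g = np.array([0.05, -2.0, 0.1, 3.0, -0.02])
idx, vals = topk_sparse_dp(g, k=2, sigma=0.0)  # sigma=0 isolates the selection
recon = decompress(idx, vals, g.size)
```

With dense noise injected before top-k selection, the noise itself can enter the selected set and the untransmitted coordinates carry no noise at all, which is exactly the subtlety the reviewer raised and that we do not claim to fully resolve.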
Comment 2 (Supplementary proofs missing)
Response: Full proofs are included in the Appendix.
- The appendix starts with **Complete Proofs**, including restated results in starred environments, so the submission is self-contained.
Comment 3 (Table/Figure numbering inconsistencies)
Response: We added an explicit reconciliation practice.
- Some comments appear to refer to an older draft numbering. In responses, we map reviewer references to the current manuscript’s numbering using the auto-index in `tools/table_figure_index.txt`.
# Reviewer rsmW
Comment 1 (Scope too broad; interactions not explored)
Response: We tightened the scope and made interactions explicit.
- The Introduction explicitly states the scope boundaries and highlights the intended interactions/tradeoffs between compression, robustness, and (optional) DP.
- The provenance component is explicitly treated as **optional/orthogonal** rather than part of the core guarantee.
Comment 2 (Related work not systematic enough)
Response: We revised related work and added a positioning view.
- Related work is organized across key axes and supported by a positioning table to make differences/comparability clearer.
Comment 3 (Technical novelty unclear)
Response: We clarified the novelty statement.
- The contributions list now explicitly highlights the joint treatment (and bounds) for compression + robustness, and clearly separates theory vs. evaluated claims.
Comment 4 (Experimental evaluation limited)
Response: The primary suite is E1–E12; we additionally prepared broader-coverage addendum experiments.
- The main claims remain tied to **E1–E12** (including stronger adversary coverage via ALIE and architectural variation via ViT-Tiny + ablations).
- We also provide an **optional rebuttal addendum**: E13 (MobileNetV2 on CIFAR-10) and E14 (ResNet-18 on CIFAR-100), produced by `code/run_tmlr_rebuttal_experiments.py` and summarized via an **auto-generated table** included in the Appendix.
Comment 5 (“PDF appears image-based”)
Response: The locally built PDF contains extractable text.
- We verified that the current build output is a standard PDF with selectable text; any image-based appearance likely came from a mismatched upload artifact.
Assigned Action Editor: ~Eduard_Gorbunov1
Submission Number: 6759