Abstract: Non-smooth communication-efficient federated optimization is crucial for many practical machine
learning applications, yet it remains largely unexplored theoretically. Recent advancements in
communication-efficient methods have primarily focused on smooth convex and non-convex regimes,
leaving a significant gap in our understanding of the more challenging non-smooth convex setting.
Additionally, the existing federated optimization literature often overlooks the importance of efficient
server-to-worker communication (downlink), focusing primarily on worker-to-server communication
(uplink). In this paper, we consider a setup where uplink communication costs are negligible and
focus on optimizing downlink communication by improving the efficiency of recent state-of-the-art
downlink schemes such as EF21-P [Gruntkowska et al., 2023] and MARINA-P [Gruntkowska et al.,
2024] in the non-smooth convex setting. We address these gaps through several key contributions.
First, we extend the non-smooth convex theory of EF21-P [Anonymous, 2024], originally developed
for single-node scenarios, to the distributed setting. Second, we extend existing results for MARINA-P
to the non-smooth convex setting. For both algorithms, we prove an optimal $\mathcal{O}(1/\sqrt{T})$ convergence
rate under standard assumptions and establish communication complexity bounds that match those
of classical subgradient methods. Furthermore, we provide theoretical guarantees for both EF21-P
and MARINA-P under constant, decreasing, and adaptive (Polyak-type) stepsizes. Our experiments
demonstrate that MARINA-P, when used with correlated compressors, outperforms other methods not
only in smooth non-convex settings (as originally shown by Gruntkowska et al. [2024]) but also in
non-smooth convex regimes. To the best of our knowledge, this work presents the first theoretical
results for distributed non-smooth optimization incorporating server-to-worker compression, along
with a comprehensive analysis of various stepsize schemes.
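For readers unfamiliar with the term, the adaptive rule referenced above is of Polyak type; as a rough illustration (the classical single-node form, which assumes knowledge of the optimal value $f^\star$ and is not necessarily the exact variant analyzed in this work), the stepsize at iteration $t$ is
\[
\gamma_t \;=\; \frac{f(x^t) - f^\star}{\|g^t\|^2}, \qquad g^t \in \partial f(x^t),
\]
where $g^t$ denotes a subgradient of the objective at the current iterate.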