
% Federated Learning (FL) is an emerging learning scheme that allows distributed clients to learn a model 
% (can be either a convex function or a non-convex function) 
% without sending the local private data to each other. Langevin Dynamics is a fundamental tool in the field of sampling and has attracted a lot of attention over the last decades. To the best of our knowledge, the theoretical guarantees concerning Langevin Dynamics method with multi-step updates are remained challenging. The major challenging for showing provable result for Langevin dynamics method with multi-step updates in the FL setting is, in each iteration the update direction is even not gradient direction. In this work, to overcome that challenge, we propose some new \Wei{we are not new, how about we adopt the synchronous coupling method to prove the convergence of federated averaging Langevin dynamics for strongly log-concave distributions.} coupling methods to deal with. We also observed that there are certain assumptions cannot be avoided in the FL optimization but can be removed under FL sampling. Further, we also show the trade-off between noise and privacy. We believe our work open a new interesting direction for both sampling and federated learning.


% v2
% We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty quantification and mean predictions  with distributed clients. In particular, we generalize beyond normal posterior distributions and consider a wide class of models. We develop theoretical guarantees for the convergence of FA-LD in strongly convex scenarios with non-i.i.d data. Using the convergence analysis, we study how the heterogeneity of data, the injected noise, and the varying learning rates affect the convergence. In particular \Wei{two in particular}, we examine in our FA-LD algorithm both independent and correlated noise used over different clients. We observe that although the posterior distribution can always be approximated in Wasserstein-2 distance, there is a trade-off between federation and communication cost. Important to our approach is the fact that this trade-off does not deteriorate with the injected noise in Langevin dynamics. As local devices may become inactive in the federated network, we also consider different averaging schemes where only partial device updates are available. In such a case, we discover that there is an additional bias that does not decay to zero \Wei{I am not 100\% confident about this part. How about: we also show convergence results based on different averaging schemes where only partial device updates are available.}.

% v3. 
We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty quantification and mean predictions with distributed clients. In particular, we generalize beyond normal posterior distributions and consider a general class of models. We develop theoretical guarantees for FA-LD for strongly log-concave distributions with non-i.i.d.\ data and study how the injected noise, the stochastic-gradient noise, the heterogeneity of data, and the varying learning rates affect the convergence. Such an analysis sheds light on the optimal choice of local updates to minimize the communication cost. Important to our approach is that the communication efficiency does not deteriorate with the injected noise in the Langevin algorithms. In addition, we examine both independent and correlated noise used over different clients in our FA-LD algorithm. We observe pairwise trade-offs among communication, accuracy, and data privacy. As local devices may become inactive in federated networks, we also show convergence results based on different averaging schemes where only partial device updates are available. In such a case, we discover an additional bias that does not decay to zero. %Empirical experiments are conducted to verify our result.



