\subsection{Proof of Theorem \ref{thm:2}}
At time $t-1$, after the local update round, the cluster parameters are:
\begin{equation}
    \mathbf{C}^{t-1'} = \mathbf{C}^{t-1} - \eta_t \mathbf{G}^{t-1}
\end{equation}

After the communication round, the parameters can be expressed as:
\begin{equation}
    \mathbf{C}^{t} = \mathbf{C}^{t-1'}\mathbf{W}^{t-1} = \mathbf{C}^{t-1}\mathbf{W}^{t-1} - \eta_t \mathbf{G}^{t-1}\mathbf{W}^{t-1}
\end{equation}
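
As an illustration of the recursion, the same two updates can be applied once more to $\mathbf{C}^{t-1}$. This sketch assumes, as the notation above suggests, that the same step size $\eta_t$ is used at every step of the interval:
\begin{equation*}
\begin{aligned}
\mathbf{C}^{t} & = \left(\mathbf{C}^{t-2}\mathbf{W}^{t-2} - \eta_t \mathbf{G}^{t-2}\mathbf{W}^{t-2}\right)\mathbf{W}^{t-1} - \eta_t \mathbf{G}^{t-1}\mathbf{W}^{t-1} \\
& = \mathbf{C}^{t-2}\mathbf{W}^{t-2}\mathbf{W}^{t-1} - \eta_t \mathbf{G}^{t-2}\mathbf{W}^{t-2}\mathbf{W}^{t-1} - \eta_t \mathbf{G}^{t-1}\mathbf{W}^{t-1}
\end{aligned}
\end{equation*}
Each gradient term $\mathbf{G}^{m}$ is therefore right-multiplied by every matrix from $\mathbf{W}^{m}$ through $\mathbf{W}^{t-1}$.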

Recursively applying this relation from time $t$ back to time $l\beta$ yields the following expression, in which each product is taken left to right in increasing index order:
\begin{equation}
\begin{aligned}
\mathbf{C}^{t} & = \mathbf{C}^{l\beta}\prod_{m=l\beta}^{t-1}\mathbf{W}^{m}-\sum_{m=l\beta}^{t-1}\left(\eta_t\mathbf{G}^{m} \prod_{r=m}^{t-1}\mathbf{W}^{r}\right)
\end{aligned}
\end{equation}
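
As a quick consistency check, setting $t = l\beta + 1$ leaves only the $m = l\beta$ term in the sum, and both products reduce to the single factor $\mathbf{W}^{l\beta}$, so the expression recovers the one-step update:
\begin{equation*}
\mathbf{C}^{l\beta+1} = \mathbf{C}^{l\beta}\mathbf{W}^{l\beta} - \eta_t \mathbf{G}^{l\beta}\mathbf{W}^{l\beta}
\end{equation*}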