\section{Outline of the Proof}\label{opop}

In this section, we outline the main steps for proving the results stated in Section \ref{main}. Further details are provided in Appendices \ref{A1} and \ref{A2}.

\subsection{Proof of Theorem \ref{thm: OMWU fails}}
 According to the update rule of (\ref{OMWU}), $\textbf{x}_{1}^{t+2}$ and $\textbf{x}_{2}^{t+2}$ are determined by $\textbf{x}_{1}^{t+1}$, $\textbf{x}_{2}^{t+1}$, $\textbf{x}_{1}^{t}$ and $\textbf{x}_{2}^{t}$. The mixed strategies of both players in the periodic game defined in Theorem~\ref{thm: OMWU fails} lie in the simplex $\Delta_2$, so $\textbf{x}_{j,2}^t$ is determined by $\textbf{x}_{j,1}^t$ for $j=1,2$ through the equation $\textbf{x}_{j,2}^t = 1- \textbf{x}_{j,1}^t$. Thus, tracing the evolution of $\textbf{x}_{j,1}^t$ suffices to trace the evolution of the players' mixed strategies. The dynamics of (OMWU) in Theorem~\ref{thm: OMWU fails} can be described equivalently by the mappings
 \begin{align*}
     \CG_i : 
     (\tb{x}_{1,1}^{t},\tb{x}_{1,1}^{t+1},\tb{x}_{2,1}^{t},\tb{x}_{2,1}^{t+1}) \to (\tb{x}_{1,1}^{t+1},\tb{x}_{1,1}^{t+2},\tb{x}_{2,1}^{t+1},\tb{x}_{2,1}^{t+2}),
 \end{align*}
$i = 1,2$, where $\CG_1$ is the update rule for even $t$ and $\CG_2$ is the update rule for odd $t$. Furthermore, we have
  \begin{align*}
     (\CG_1\circ\CG_2)^t & \left( (\tb{x}_{1,1}^{-1},\tb{x}_{1,1}^{0},\tb{x}_{2,1}^{-1},\tb{x}_{2,1}^{0}) \right)\\
     &=(\tb{x}_{1,1}^{2t-1},\tb{x}_{1,1}^{2t},\tb{x}_{2,1}^{2t-1},\tb{x}_{2,1}^{2t}),
 \end{align*}
 so the divergence of (\ref{OMWU}) can be deduced from the divergence of the iterates of $\CG_1\circ\CG_2$.
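
As a check of the index bookkeeping, the following sketch uses only the convention stated above. The shorthand $z^{t} = (\tb{x}_{1,1}^{t-1},\tb{x}_{1,1}^{t},\tb{x}_{2,1}^{t-1},\tb{x}_{2,1}^{t})$ is introduced here for illustration only. One application of $\CG_1\circ\CG_2$ advances the state by two rounds:
\begin{align*}
    z^{0} \xrightarrow{\ \CG_2 \ (t=-1,\ \text{odd})\ } z^{1} \xrightarrow{\ \CG_1 \ (t=0,\ \text{even})\ } z^{2},
\end{align*}
so $(\CG_1\circ\CG_2)(z^{0}) = z^{2}$, and by induction $(\CG_1\circ\CG_2)^t(z^{0}) = z^{2t}$, in agreement with the identity displayed above.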

With the above construction, the proof of Theorem \ref{thm: OMWU fails} is divided into three parts. First, Proposition~\ref{prop: OMWU not converge} demonstrates that every arbitrarily small neighborhood of the equilibrium contains an initial condition whose trajectory does not converge to the equilibrium.
Second, Proposition~\ref{prop: OMWU fails} establishes that the KL-divergence increases monotonically until either $\textbf{x}^t_{1}$ or $\textbf{x}^t_{2}$ comes sufficiently close to the boundary.
Third, Proposition~\ref{prop: converge to boundary} shows that any point close to the boundary ultimately converges to it, which causes the KL-divergence to tend to infinity.
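
To see why convergence to the boundary forces the KL-divergence to blow up, consider a single player with two actions. The following sketch assumes the standard definition of the KL-divergence with the natural logarithm and an equilibrium component in the interior, $p^* \in (0,1)$. The symbols $p^*$ and $p$ are shorthand introduced here for $\tb{x}_{j,1}^*$ and $\tb{x}_{j,1}^t$:
\begin{align*}
    \KL \big( (p^*,1-p^*),(p,1-p) \big) = p^* \ln \frac{p^*}{p} + (1-p^*) \ln \frac{1-p^*}{1-p}.
\end{align*}
As $p \to 0$, the first term tends to $+\infty$ while the second term stays bounded. Symmetrically, as $p \to 1$, the second term tends to $+\infty$ while the first stays bounded. Under these assumptions, approaching either endpoint of the simplex therefore drives the KL-divergence to infinity.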


    \begin{restatable}{prop}{OMWUnotconverge}\label{prop: OMWU not converge}
        In any arbitrarily small neighborhood $\CU$ of the equilibrium $(\tb{x}_1^*,\tb{x}_2^*)$ of (\ref{2-periodic game_m}), there exists an initial condition such that the trajectory of (OMWU) starting from it does not converge to $(\tb{x}_1^*,\tb{x}_2^*)$.
    \end{restatable}

We prove Proposition \ref{prop: OMWU not converge} by calculating the eigenvalues of the Jacobian matrix of $\CG_2\circ\CG_1$ at the equilibrium, which is a standard technique in the local analysis of dynamical systems \citep{galor2007discrete}. Similar methods are also used to prove last-iterate convergence results for several learning algorithms in time-independent games \citep{daskalakis2018last,fasoulakis2022forward}.
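
The underlying criterion can be sketched as follows, assuming that the map is differentiable at the equilibrium and that the equilibrium is a fixed point of $\CG_2\circ\CG_1$. The symbols $z^*$ (the equilibrium in the coordinates of $\CG_2\circ\CG_1$), $J$ (the Jacobian matrix at $z^*$), and $v$ (a real eigenvector of $J$ with real eigenvalue $\lambda$) are introduced here for illustration only. A first-order expansion gives
\begin{align*}
    (\CG_2\circ\CG_1)(z^* + \epsilon v) = z^* + \epsilon J v + o(\epsilon) = z^* + \epsilon \lambda v + o(\epsilon), \qquad \epsilon \to 0.
\end{align*}
If $\lvert \lambda \rvert > 1$, then, to first order, the perturbation along $v$ is amplified by each application of the map. This is only a heuristic; turning it into a rigorous non-convergence statement requires more care than the first-order expansion shown here.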


    
%    The proof of Proposition~\ref{prop: OMWU not converge} can be derived from the fact that the Jacobi matrix of $\CG_2\circ\CG_1$ at the equilibrium possesses eigenvalues larger than 1. By employing Proposition \ref{decomposition}, it follows that the corresponding eigenvectors will not converge to the equilibrium.

        \begin{restatable}{prop}{KLincreasing}
        \label{prop: OMWU fails}
			Under the same conditions stated in Theorem~\ref{thm: OMWU fails}, there exists a constant $c$, which is independent of $\eta$, such that for any $t\ge 3$, 
   \begin{align*}
       \KL ((\textbf{x}_1^*,\textbf{x}_2^*),(\textbf{x}_1^{t+2},\textbf{x}_2^{t+2}))-
       \KL ((\textbf{x}_1^*,\textbf{x}_2^*),(\textbf{x}_1^{t},\textbf{x}_2^{t}))
       \ge c\eta^3
   \end{align*}
  
        unless either $\textbf{x}^t_{1}$ or $\textbf{x}^t_{2}$ is $\mathrm{O}(\eta^{1/2})$-close to the boundary.
		\end{restatable}
  
We prove Proposition~\ref{prop: OMWU fails} by directly tracing the trajectories of the mixed strategies as they evolve under (OMWU). Proposition~\ref{prop: OMWU fails} also implies that, as long as the players' current mixed strategies are far from the boundary of the simplex, the iterations of (OMWU) steadily move them toward the boundary.
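
The quantitative content of this statement can be made explicit by a telescoping argument. As a sketch, write $D_t = \KL ((\textbf{x}_1^*,\textbf{x}_2^*),(\textbf{x}_1^{t},\textbf{x}_2^{t}))$, a shorthand introduced here, and suppose that for some $t \ge 3$ the strategies stay outside the $\mathrm{O}(\eta^{1/2})$-neighborhood of the boundary at times $t, t+2, \dots, t+2(k-1)$. Summing the inequality of Proposition~\ref{prop: OMWU fails} over these times gives
\begin{align*}
    D_{t+2k} - D_t = \sum_{s=0}^{k-1} \left( D_{t+2s+2} - D_{t+2s} \right) \ge k c \eta^3 .
\end{align*}
Since the KL-divergence is bounded on any region that stays a fixed distance away from the boundary, the left-hand side cannot grow without bound there, so the trajectory cannot remain in such a region for all $k$.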
  
%The proof of Proposition~\ref{prop: OMWU fails} involves of a detailed analysis for the changing probability of strategies under different cases of value of $\tb{x}_1^0 $ and $ \tb{x}_2^0$. The relationship between the KL-divergence and the probability of strategies determines the increase in KL-divergence. 

    \begin{restatable}{prop}{OMWUlocallyconverging}		\label{prop: converge to boundary}
			There exists a neighborhood $\CW$ of the boundary of the simplex such that for all $(\textbf{x}_{1,1}^{-1},\textbf{x}_{1,1}^{0},\textbf{x}_{2,1}^{-1},\textbf{x}_{2,1}^{0})\in \CW$, we have
   \begin{align*}
          \lim_{n \to \infty} \KL ((\tb{x}_{1,1}^*,&\tb{x}_{1,1}^*,\tb{x}_{2,1}^*,\tb{x}_{2,1}^*), \\ &(\CG_1 \circ \CG_2)^n(\textbf{x}_{1,1}^{-1},\textbf{x}_{1,1}^{0},\textbf{x}_{2,1}^{-1},\textbf{x}_{2,1}^{0})) 
          =+\infty.
   \end{align*}
		\end{restatable}

By combining Proposition \ref{prop: converge to boundary} and Proposition \ref{prop: OMWU fails}, we obtain a comprehensive understanding of the dynamics of (OMWU) in the games defined in Theorem \ref{thm: OMWU fails}. First, when the mixed strategies are far away from the boundary of the simplex, they steadily
approach the boundary (Proposition \ref{prop: OMWU fails}). Second, once they are close enough to the boundary, they are attracted to it, causing the KL-divergence to tend to infinity (Proposition \ref{prop: converge to boundary}).

We prove Proposition \ref{prop: converge to boundary} by analyzing the eigenvalues and the corresponding stable eigenspace of the Jacobian matrix of $\CG_1 \circ \CG_2$ at its fixed points. Interestingly, we find that these fixed points form a continuous curve, and none of the points on this curve is an equilibrium. This phenomenon is new to periodic games, because in time-independent games the dynamical system modeling the learning algorithm usually has only discrete equilibrium points as fixed points \citep{daskalakis2018last}. In Figure~\ref{Fixed points}, we present the curves composed of the fixed points of $\CG_1 \circ \CG_2$ for different step sizes.



  
%The proof of Proposition~\ref{prop: converge to boundary} follows from the properties of $\CG_1\circ \CG_2$ on the boundary.
%There exists continuous fixed points of $\CG_1\circ \CG_2$ on the boundary.
%The Jacobi matrix of the composition $\CG_1\circ \CG_2$ has eigenvalues that are all less than or equal to $1$ at each fixed point, and the eigenvector corresponding to $1$ doesn't affect the convergence to the boundary. Consequently, this leads to the convergence towards the boundary. In Figure (\ref{Fixed points}), we present instances of fixed points of the composition $\CG_1\circ \CG_2$, $(\textbf{x}_{1,2}^t,\textbf{x}_{1,2}^{t+1},\textbf{x}_{2,2}^t,\textbf{x}_{2,2}^{t+1})=(0,0,a,\frac{a\cdot e^{3\eta}}{a\cdot e^{3\eta}+(1-a)})$ for $a\in(0,1)$ under different values of $\eta$, when $\textbf{x}_1^t$ reaches the boundary $(0,1)$. 

\begin{figure}[h]
    \centering
    \includegraphics[width=0.43\textwidth]{fixed_points.png}
    \caption{Curves composed of the fixed points of $\CG_1 \circ \CG_2$ for different step sizes.}
    \label{Fixed points}
\end{figure}





\subsection{Proof of Theorem \ref{T2}}

Recall that in the (\ref{EMWU}) algorithm, each update from $(\tb{x}^t_1,\tb{x}^t_2)$ to $(\tb{x}^{t+1}_1,\tb{x}^{t+1}_2)$ is divided into two steps. First, an intermediate point
$(\tb{x}^{t+\frac{1}{2}}_1,\tb{x}^{t+\frac{1}{2}}_2)$ is calculated based on the players' payoffs in the $t$-th round of the game. Second, $(\tb{x}^t_1,\tb{x}^t_2)$ and
the intermediate point are used together to calculate
$(\tb{x}^{t+1}_1,\tb{x}^{t+1}_2)$. Since the game is periodic, the update rule of (\ref{EMWU}) in a given round also depends on the payoff matrix $A_i$, $i \in [\CT]$, used in that round. We use
\begin{align*}
    \CF_{i} : \Delta_m \times \Delta_n &\to \Delta_m \times \Delta_n \\
    (\tb{x}^t_1,\tb{x}^t_2) &\mapsto (\tb{x}^{t+1}_1,\tb{x}^{t+1}_2)
\end{align*}
to denote the dynamical system determined by the (\ref{EMWU}) algorithm with payoff matrix
$A_i$. Thus, the algorithm is described by the $\CT$-periodic dynamical system defined by $\{\CF_i \}^{\CT}_{i =1}$.

By Proposition \ref{attrat}, the convergence properties of such a periodic dynamical system can be studied by analyzing the corresponding autonomous systems obtained by composing the maps over one period:
\begin{align*}
    \tilde{\CF}_i = \CF_{i+\CT-1} \circ \CF_{i+\CT-2} \circ \cdots \circ \CF_{i+1} \circ \CF_{i},
\end{align*}
where $i \in [\CT]$ and the subscripts of $\CF$ are understood modulo $\CT$. Furthermore, the periodic system converges to $(\tb{x}_1^*,\tb{x}_2^*)$ if the iterates of $\tilde{\CF}_i$ converge to $(\tb{x}_1^*,\tb{x}_2^*)$ for all $i$. Thus, the main step in proving Theorem \ref{T2} is to establish convergence results for $\tilde{\CF}_i$.
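
For instance, in the $2$-periodic case $\CT = 2$, and reading the subscripts of $\CF$ modulo $\CT$ (an assumption made here, since $\CF_i$ is only defined for $i \in [\CT]$), the two composed maps are
\begin{align*}
    \tilde{\CF}_1 = \CF_2 \circ \CF_1, \qquad \tilde{\CF}_2 = \CF_1 \circ \CF_2 .
\end{align*}
Each $\tilde{\CF}_i$ is a single time-independent map, and iterating it follows the iterates of (\ref{EMWU}) along the rounds $i, i+2, i+4, \dots$, that is, once per period.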


For a fixed $(\tb{x}_1,\tb{x}_2)$, $\KL \left((\tb{x}_1,\tb{x}_2),(\tb{x}_1',\tb{x}_2')\right) = 0$ if and only if $ (\tb{x}_1',\tb{x}_2') = (\tb{x}_1,\tb{x}_2)$. Thus, to prove that the iterates of $\tilde{\CF}_i$ converge to the equilibrium $(\tb{x}_1^*,\tb{x}_2^*)$, it is enough to prove
\begin{align*}
    \lim_{n \to \infty} \KL \left( (\tb{x}_1^*,\tb{x}_2^*), \tilde{\CF}^n_i (\tb{x}_1,\tb{x}_2)\right) = 0, 
\end{align*}
for an arbitrary initial point $(\tb{x}_1,\tb{x}_2)$. The following proposition states that in a periodic zero-sum game, the KL-divergence between the equilibrium and the current strategies decreases under each iteration of $\tilde{\CF}_i$.

\begin{prop}\label{decreasing_KL1} Under the same assumptions as in Theorem \ref{T2}, for any $i \in [\CT]$ and $n$, if the step size $\eta$ in (Extra-MWU) satisfies $\eta \cdot \max_{t \in [\CT]}\lVert A_t \rVert < 1$, then we have
		\begin{align*}
			\KL & \left( (\tb{x}_1^*,\tb{x}_2^*),  \tilde{\CF}_i (\tb{x}_1^{n\CT+i},  \tb{x}_2^{n\CT+i})  \right)  \\
  & \le \KL\left( (\tb{x}_1^*,\tb{x}_2^*), (\tb{x}_1^{n\CT+i},\tb{x}_2^{n\CT+i})  \right),
		\end{align*}
		and equality holds if and only if $ (\tb{x}_1^{n\CT+i},\tb{x}_2^{n\CT+i})=(\tb{x}_1^*,\tb{x}_2^*)$.
	\end{prop}

The proof of Proposition \ref{decreasing_KL1} relies on a detailed analysis of the behavior of the KL-divergence under the two-step update of (Extra-MWU). Such monotonic decrease of the KL-divergence also plays an important role in proving convergence results for both (OMWU) and (Extra-MWU) in static games \citep{mertikopoulos2018optimistic,daskalakis2018last,fasoulakis2022forward}. 

Proposition \ref{decreasing_KL1} is not sufficient to guarantee the convergence of $\tilde{\CF}_i$ to the equilibrium, as the rate at which the KL-divergence decreases can be slow when the current strategy is close to the equilibrium. To address this issue, we employ the following LaSalle invariance principle.
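
A one-dimensional toy sequence, not taken from the game itself, illustrates the gap. The numbers
\begin{align*}
    a_n = 1 + \frac{1}{n+1}, \qquad n \ge 0,
\end{align*}
are strictly decreasing and bounded below by $0$, yet they converge to $1$ rather than to $0$. Strict monotonicity of the KL-divergence along a trajectory therefore does not by itself exclude a positive limit, and an additional argument is needed to identify the limit.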

\begin{prop}[\cite{la1976stability}]\label{DLIP1} Let $G$ be any set in $\BR^m$. Consider a system of difference equations defined by a map $T : G \to G$ that is well defined and continuous at every $x \in G$. Suppose there exists a scalar map $V : \bar{G} \to \BR$ satisfying 
\begin{itemize}
    \item $V(x)$ is continuous at any $x \in \bar{G}$,
    \item $V\left(T(x)\right) - V(x) \le 0$ for any $x \in G$.
\end{itemize}
    For any $x_0 \in G$, if the solution of the initial-value problem
        $x(n+1) = T(x(n))$, $x(0) = x_0$,
is such that $\{ x(n) \}^{\infty}_{n=1}$ is bounded and $x(n) \in G$ for all $n \in \BN$, then there exists some $c \in \BR$ such that 
\begin{align*}
    x(n) \to M \cap V^{-1}(c)
\end{align*}
as $n \to \infty$, where $V^{-1}(c) = \{ x \in \BR^m \mid V(x) = c \}$, and $M$ is the largest invariant set in 
\begin{align*}
    E= \{x \in G \mid V(T(x))-V(x)=0 \}.
\end{align*}
\end{prop}

In our case, $\tilde{\CF}_i$ plays the role of $T$, $\Delta_m \times \Delta_n$ plays the role of $G$, and, according to Proposition \ref{decreasing_KL1}, the KL-divergence can serve as the scalar map $V$. The LaSalle invariance principle guarantees that every limit point under the iteration of 
$\tilde{\CF}_i$ lies in the set of points $(\tb{x}_1,  \tb{x}_2)$ satisfying 
\begin{align*}
    \KL   \left( (\tb{x}_1^*,\tb{x}_2^*),  \tilde{\CF}_i (\tb{x}_1,  \tb{x}_2)  \right) 
     =  \KL  \left( (\tb{x}_1^*,\tb{x}_2^*),  (\tb{x}_1,  \tb{x}_2)  \right).
\end{align*}
Moreover, according to Proposition \ref{decreasing_KL1}, the only such $(\tb{x}_1,  \tb{x}_2)$ is the equilibrium point. This completes the proof that, under the iteration of $\tilde{\CF}_i$, 
all initial points in $\Delta_m \times \Delta_n$ converge to the equilibrium of the periodic game. Combining this with Proposition \ref{attrat}, we conclude that (Extra-MWU) converges to the equilibrium.

