In this section, we will give a high-level overview of the key proof ideas for the main theorem. The full proof details can be found in Appendix \ref{Appendix:Main}. 

\textbf{Proof Structure.} The proof is divided into four steps. In Step 1, we examine how accurately the learner estimates the unstable subspace $E_u$ in Stage 1. We show that $\Pi_1$ and $P_1$ can be estimated up to errors of $\epsilon$ and $\delta$, respectively, within $T = O \left(k \log k + \log(n-k) - \log \epsilon - \log\gap\right)$ steps, where $\delta:=\sqrt{2k}\epsilon$. In Step 2, we examine how accurately the learner estimates $M_1$ in Stage 2, and show that $M_1$ can be estimated up to an error of $3\norm{A}\delta$. In Step 3, we bound the estimation error of $B_\tau$ in Stage 3. Lastly, in Step 4, we show that the $\tau$-hop controller output by Algorithm \ref{alg:LTS0} stabilizes the system. %The proof is based on a detailed spectral analysis of the dynamical matrix of the closed-loop system. 

\textbf{Overview of Step 1.}
To upper bound the estimation errors in Stage 1, we use the SVD to isolate the unstable subspace and the Davis-Kahan Theorem to decouple the system dynamics from the noise perturbation. The bound on $\norm{\Pi_1 - \hat{\Pi}_1}$ is given in Theorem \ref{thm:projection}. 

\begin{theorem}
\label{thm:projection}
For a linear dynamic system with noise $x_{t+1} = A x_t + \eta_t$ satisfying \Cref{assumption:eigengap} and \Cref{assumption:pdf}, let $E_u$ be the unstable subspace of $A$, $k=\dim E_u$ be the instability index of the system and $\Pi_1$ be the orthogonal projector onto subspace $E_u$. Then for any $\epsilon > 0$, by running Stage 1 of Algorithm 1 for $T$ time steps, where 
\begin{equation*}
    T = O \left(k \log k + \log(n-k) - \log \epsilon - \log\gap\right),
\end{equation*}
 we obtain an estimate $\hat{\Pi}_1 = U^{(k)}(U^{(k)})^*$ with error $\norm{\hat{\Pi}_1 - \Pi_1} < \epsilon$. Here, the big-O notation shows only the dependence on $k$, $n$, and $\epsilon$, and omits the dependence on $C, C_z, |\lambda_1|, |\lambda_k|,|\lambda_{k+1}|$, and $\theta$. 
\end{theorem}
The proof of \Cref{thm:projection} is deferred to \Cref{Appendix:proj_proof}. Overall, \Cref{thm:projection} gives a non-asymptotic bound on the rate at which the last $n-k$ singular values of $D$ decay. Therefore, even if the learner does not know the exact value of $k$, it can infer $k$ from the data: the first $k$ singular values of $D$ grow exponentially while the last $n-k$ singular values decay exponentially, so a large gap appears between the $k$-th and $(k+1)$-th singular values. 
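
As an illustration of how $k$ might be read off in practice, one simple rule (an assumption made here for exposition, not a step of the algorithm analyzed in this paper) is to pick the index with the largest ratio between consecutive singular values. Writing $\sigma_1(D) \geq \dots \geq \sigma_n(D) > 0$ for the singular values of $D$ and $\hat{k}$ for the resulting hypothetical estimate,
\begin{equation*}
    \hat{k} \in \arg\max_{1 \leq i \leq n-1} \frac{\sigma_i(D)}{\sigma_{i+1}(D)} .
\end{equation*}
Since the first $k$ singular values grow exponentially while the last $n-k$ decay exponentially, the ratio at $i = k$ grows exponentially in $T$. Whether it dominates every other ratio for a given $T$ depends on the spectrum of $A$, so this rule should be read as a heuristic sketch rather than a guarantee.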

\textbf{Overview of Step 2.} To upper bound the error in Stage 2, we bound the error of the least-squares solution $\arg\min_{M_1} \sum_{t=0}^T \norm{(U^{(k)})^* x_{t+1} - M_1 (U^{(k)})^* x_t}^2$ and obtain the following proposition.
\begin{proposition}
\label{prop:G2}
    Under the premise of Theorem~\ref{thm:main}, we have
    \begin{equation*}
        \norm{\hat{M}_1^\tau - M_1^\tau} \leq 3 \tau \norm{A} \zeta_{\epsilon_1} (A)^2(|\lambda_1|+\epsilon_1)^{\tau-1} \delta,
    \end{equation*}
    where $\zeta_{\epsilon_1}(A)$ is the constant from Gelfand's formula defined in \Cref{lemma:Gelfand}, and we recall that $\delta$ is the estimation error for $P_1$.
\end{proposition}

The proofs for this step, including the related lemmas and propositions, are deferred to Appendix~\ref{Appendix:ls}.
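
To see where the factor $\tau(|\lambda_1|+\epsilon_1)^{\tau-1}$ in \Cref{prop:G2} plausibly comes from, the following sketch may help. It assumes, for illustration only, that both $M_1$ and $\hat{M}_1$ satisfy a Gelfand-type bound $\norm{M^j} \leq \zeta_{\epsilon_1}(A)(|\lambda_1|+\epsilon_1)^j$ for all $j \geq 0$ with the same constant; the rigorous argument is the one in Appendix~\ref{Appendix:ls}. The telescoping identity
\begin{equation*}
    \hat{M}_1^\tau - M_1^\tau = \sum_{i=0}^{\tau-1} \hat{M}_1^{i} \left(\hat{M}_1 - M_1\right) M_1^{\tau-1-i},
\end{equation*}
combined with the one-step bound $\norm{\hat{M}_1 - M_1} \leq 3\norm{A}\delta$ from Step 2, then gives
\begin{equation*}
    \norm{\hat{M}_1^\tau - M_1^\tau}
    \leq \sum_{i=0}^{\tau-1} \zeta_{\epsilon_1}(A)(|\lambda_1|+\epsilon_1)^{i} \cdot 3\norm{A}\delta \cdot \zeta_{\epsilon_1}(A)(|\lambda_1|+\epsilon_1)^{\tau-1-i}
    = 3 \tau \norm{A} \zeta_{\epsilon_1}(A)^2 (|\lambda_1|+\epsilon_1)^{\tau-1} \delta .
\end{equation*}
Each of the $\tau$ summands contributes the same product of powers, which accounts for the linear factor $\tau$ and the exponent $\tau-1$.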

\textbf{Overview of Step 3.} To bound the error in Stage 3, we upper bound the error in each column of $B_{\tau}$. In particular, we show that \eqref{eqn:b} generates an estimate of $B_{\tau}$ with an error of the same order as $\delta$. The details are deferred to \Cref{prop:G6} in Appendix~\ref{Appendix:boundingB}. 

\textbf{Overview of Step 4.} To analyze the stability of the closed-loop system, we first write out the closed-loop dynamics under the $\tau$-hop controller. Recall that in Section \ref{section:tau-hop-control} we defined $\Tilde{u}_s, \Tilde{x}_s, \Tilde{y}_s$ to be the control input, the state in $x$-coordinates, and the state in $y$-coordinates of the $\tau$-hop control system, respectively. Using this notation, the learner obtains the controller from its estimates of $M_1^\tau$ and $B_\tau$ using any stabilization algorithm (e.g., LQR or pole placement). 

Therefore, the closed-loop $\tau$-hop dynamics are 

\begin{equation}
\label{eqn:tau_hop_closed}
    \begin{split}
        \Tilde{y}_{s+1} &= \hat{L}
    \begin{bmatrix}
        \Tilde{y}_{1,s}\\ \Tilde{y}_{2,s}
    \end{bmatrix}
    + \sum_{i = 0}^{\tau - 1} P^{-1} A^{i} P \eta_{s\tau + i}
    \\
    &:=
    \hat{L} \Tilde{y}_s + \sum_{i = 0}^{\tau - 1} 
    \begin{bmatrix}
        P_1^* A^{i} \\ P_2^* A^{i}
    \end{bmatrix}
    \eta_{s\tau + i} ,
    \end{split}
\end{equation}
\normalsize
where 

\begin{equation}
\label{eqn:L_hat}
    \begin{split}
        \hat{L} &:= \begin{bmatrix}
        M_1^\tau + P_1^* A^{\tau-1} B \hat{K}_1 \hat{P}_1^* P_1 &
        \Delta_\tau + P_1^* A^{\tau-1}B \hat{K}_1 \hat{P}_1^* P_2 \\
        P_2^* A^{\tau - 1} B \hat{K}_1 \hat{P}_1^* P_1 &
        M_2^\tau + P_2^* A^{\tau - 1} B \hat{K}_1 \hat{P}_1^* P_2
        \end{bmatrix}
        \\
        &:= \begin{bmatrix}
            \hat{L}_{1,1} & \hat{L}_{1,2} \\
            \hat{L}_{2,1} & \hat{L}_{2,2}
        \end{bmatrix} .
    \end{split}
\end{equation}
\normalsize

We will show that the above system is ultimately bounded (i.e., $\rho(\hat{L}) < 1$). Note that $\hat{L}$ has a 2-by-2 block form, so we can use the following lemma for the spectral analysis of block matrices. 

\begin{lemma}
    For block matrices $A = \begin{bmatrix}
        A_1 & 0 \\ 0 & A_2
    \end{bmatrix}$, $E = \begin{bmatrix}
        0 & E_{12} \\ E_{21} & 0
    \end{bmatrix}$, 
    the spectral radii of $A$ and $A+E$ satisfy $|\rho(A+E) - \rho(A)| \leq \chi(A+E) \norm{E_{12}}\norm{E_{21}}$, where $\chi(A+E)$ is a constant. 
\end{lemma}

The proof of the lemma can be found in existing literature such as \cite{Nakatsukasa18}. Therefore, we need to ensure the stability of the diagonal blocks of $\hat{L}$ and upper-bound the norms of the off-diagonal blocks via estimation of factors appearing in these blocks.
Complete proofs can be found in Appendix \ref{Appendix:Main}. 
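
As a sketch of how the block-matrix lemma is instantiated here, one can split $\hat{L}$ from \eqref{eqn:L_hat} into its block-diagonal and off-diagonal parts,
\begin{equation*}
    \hat{L} =
    \begin{bmatrix}
        \hat{L}_{1,1} & 0 \\ 0 & \hat{L}_{2,2}
    \end{bmatrix}
    +
    \begin{bmatrix}
        0 & \hat{L}_{1,2} \\ \hat{L}_{2,1} & 0
    \end{bmatrix} .
\end{equation*}
Since the spectral radius of a block-diagonal matrix is the maximum of the spectral radii of its diagonal blocks, the lemma yields
\begin{equation*}
    \rho(\hat{L}) \leq \max\left\{\rho(\hat{L}_{1,1}),\, \rho(\hat{L}_{2,2})\right\} + \chi(\hat{L}) \norm{\hat{L}_{1,2}} \norm{\hat{L}_{2,1}} .
\end{equation*}
Hence $\rho(\hat{L}) < 1$ follows once both diagonal blocks are stable and the product of the off-diagonal norms is small enough relative to the stability margin of the diagonal blocks; quantifying these terms is the role of the estimates in Steps 1 to 3.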
