This section collects the mathematical background used in this paper. The notation here is independent of the rest of the paper.

\begin{theorem}[Davis-Kahan]
    Let $A$ be an $n \times n$ Hermitian matrix, and suppose we have the following spectral decomposition for $A$
    \begin{equation*}
        A = \sum_{i=1}^n \lambda_i u_i u_i^*,
    \end{equation*}
    where $\lambda_1 > \dots > \lambda_n$ are the eigenvalues of $A$ and $u_1, \dots, u_n$ are corresponding orthonormal eigenvectors. Let $H$ be an $n \times n$ Hermitian perturbation matrix, and let the spectral decomposition of $A + H$ be
    \begin{equation*}
        A + H = \sum_{i=1}^n \mu_i v_i v_i^*.
    \end{equation*}
    Define 
    \begin{equation*}
        P = \sum_{i = 1}^k u_i u_i^* := U U^*
    \end{equation*}
    to be the orthogonal projection onto the $k$-dimensional eigenspace spanned by $u_1, \dots, u_k$. Similarly, define $Q = \sum_{i=1}^k v_i v_i^* := V V^*$.

    Suppose there exists $\delta > 0$ such that $|\lambda_i - \mu_j| > \delta$ for all $i \in \{1,\dots,k\}$ and $j \in \{k+1,\dots,n\}$. Then $P - Q$ satisfies
    \begin{equation*}
        \norm{P-Q}_{op} \leq \norm{P-Q}_F \leq \frac{\sqrt{2k}\norm{H}_{op}}{\delta},
    \end{equation*}
    where $\norm{\cdot}_F$ denotes the Frobenius norm. 
\end{theorem}
This is a standard result; a proof can be found in, for instance, \cite{Davis-Kahan}.
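
As a quick sanity check of the bound (an illustrative example, not taken from \cite{Davis-Kahan}), take $n = 2$, $k = 1$, and, for a real parameter $\eta \neq 0$,
\begin{equation*}
    A = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \qquad H = \begin{pmatrix} 0 & \eta \\ \eta & 0 \end{pmatrix}.
\end{equation*}
Then $\lambda_1 = 2$, $\norm{H}_{op} = |\eta|$, and the eigenvalues of $A + H$ are $\mu_1 = 1 + \sqrt{1 + \eta^2}$ and $\mu_2 = 1 - \sqrt{1 + \eta^2}$, so $|\lambda_1 - \mu_2| = 1 + \sqrt{1 + \eta^2} > 2$ and the gap condition holds with $\delta = 2$. The top eigenvector $v_1$ of $A + H$ makes an angle $\theta$ with $u_1$, where $\tan 2\theta = \eta$, and for rank-one projections $\norm{P-Q}_F = \sqrt{2}\,|\sin\theta|$. Hence
\begin{equation*}
    \norm{P-Q}_F = \sqrt{2}\,|\sin\theta| \leq \sqrt{2}\,|\theta| = \frac{\sqrt{2}}{2} \arctan|\eta| \leq \frac{|\eta|}{\sqrt{2}} = \frac{\sqrt{2k}\norm{H}_{op}}{\delta},
\end{equation*}
so in this example the bound holds and is attained to first order in $\eta$.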

\begin{lemma}[Gelfand's formula]
\label{lemma:Gelfand}
    For any square matrix $X$ with spectral radius $\rho(X)$, we have
    \begin{equation*}
        \rho(X) = \lim_{t \rightarrow \infty} \norm{X^t}^{1/t}.
    \end{equation*}
    In other words, for any $\epsilon > 0$, there exists a constant $\zeta_{\epsilon}(X)$ such that, for all positive integers $t$,
    \begin{equation*}
        \sigma_{\max}(X^t) = \norm{X^t} \leq \zeta_{\epsilon}(X)(\rho(X) + \epsilon)^t. 
    \end{equation*}
    Further, if $X$ is invertible and $\lambda_{\min}(X)$ denotes an eigenvalue of $X$ with minimum modulus, then
    \begin{equation*}
        \sigma_{\min}(X^t) \geq \frac{1}{\zeta_{\epsilon}(X^{-1})} \left(\frac{|\lambda_{\min}(X)|}{1 + \epsilon |\lambda_{\min}(X)|}\right)^t. 
    \end{equation*}
\end{lemma}
The proof can be found in the existing literature (e.g., \cite{Roger12}).
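
To see how the last bound in Lemma~\ref{lemma:Gelfand} follows from the second one (a short sketch, assuming $\norm{\cdot}$ denotes the operator norm), apply the second bound to $X^{-1}$, whose spectral radius is $\rho(X^{-1}) = 1/|\lambda_{\min}(X)|$, and use $\sigma_{\min}(X^t) = 1/\norm{X^{-t}}$:
\begin{equation*}
    \sigma_{\min}(X^t) = \frac{1}{\norm{(X^{-1})^t}} \geq \frac{1}{\zeta_{\epsilon}(X^{-1})} \left(\frac{1}{|\lambda_{\min}(X)|} + \epsilon\right)^{-t} = \frac{1}{\zeta_{\epsilon}(X^{-1})} \left(\frac{|\lambda_{\min}(X)|}{1 + \epsilon |\lambda_{\min}(X)|}\right)^t.
\end{equation*}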