\section{Experiments}\label{sec:experiments}

In this section, we present experiments on synthetic and real-world data to validate our theory. For all experiments, we estimate the variance of the entries of $\roja$ (see Eq.~\ref{eq:hoeffding}) by scaling the output of Algorithm~\ref{alg:variance_estimation} by $\eta_{B}\bb{\lambda_1-\lambda_2}$.

\subsection{Synthetic data experiments}
\label{sec:synthetic_experiments}

We provide numerical experiments to compare Algorithm~\ref{alg:variance_estimation} ($\ojavarest$) with the multiplier-bootstrap-based algorithm proposed in \cite{lunde2021bootstrapping}. As discussed in Section~\ref{ssec:uncertainty_estimation}, given a dataset $\mathcal{D}_n := \left\{X_i\right\}_{i \in [n]}$, we choose $\tilde{v}$ for $\ojavarest$ as $\tilde{v} := \Oja\bb{\mathcal{D}_n, \eta_n, z/\norm{z}_2}$ for $z \sim \mathcal{N}\bb{0, I}$ and set $m_{1} = 3$, $m_{2} = \log\bb{n}$, and $N = n$. Given a variance estimate, $\hat{\sigma}^{2}_{\ojavarest}$, we construct a $\bb{1-\alpha}$-confidence interval as $\tilde{v} \pm z_{\frac{\alpha}{2}}\hat{\sigma}_{\ojavarest}$.
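Written out for a single coordinate $k \in [d]$, with $\hat{\sigma}_{\ojavarest, k}$ denoting the estimated standard deviation of that coordinate (the coordinate subscript is notation introduced here only for illustration), the interval is
\[
  \left[\, \tilde{v}_k - z_{\frac{\alpha}{2}}\,\hat{\sigma}_{\ojavarest, k},\;\; \tilde{v}_k + z_{\frac{\alpha}{2}}\,\hat{\sigma}_{\ojavarest, k} \,\right],
\]
where $z_{\frac{\alpha}{2}}$ is the upper $\alpha/2$ quantile of the standard normal distribution; for a $95\%$ target, $\alpha = 0.05$ and $z_{0.025} \approx 1.96$.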

For the bootstrap algorithm, we follow Algorithm 1 of \cite{lunde2021bootstrapping}: we use $b$ bootstrap samples to generate estimates $v^{*(1)}, \ldots, v^{*(b)}$ and estimate the variance by the average squared residual with respect to $\vmain$. Again, given a variance estimate, $\hat{\sigma}^{2}_{\bootstrap}$, we construct a $\bb{1-\alpha}$-confidence interval as $\vmain \pm z_{\frac{\alpha}{2}}\hat{\sigma}_{\bootstrap}$.
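As a sketch of this computation for coordinate $k$, assuming $e_k$ denotes the $k$-th standard basis vector (notation introduced here only for illustration), the bootstrap variance estimate described above is
\[
  \hat{\sigma}^{2}_{\bootstrap, k} \;=\; \frac{1}{b}\sum_{j=1}^{b} \left( e_k^{\top} v^{*(j)} - e_k^{\top}\vmain \right)^{2}.
\]
Note that for $b = 1$ this average consists of a single squared residual.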

We also use the data generation process proposed in \cite{lunde2021bootstrapping} for our experiments. Specifically, we begin by generating independent samples $Z_{ij} \sim \operatorname{Uniform}(-\sqrt{3},\sqrt{3})$ for indices $i \in [n]$ and $j \in [d]$. Next, we define a positive semidefinite matrix $K$ with entries $K_{ij} = \exp(-c\,|i-j|)$, where $c = 0.01$. We then construct a covariance matrix $\Sigma$ via $\Sigma_{ij} = K_{ij}\,\sigma_i\,\sigma_j$, where the scaling factors are $\sigma_i = 5\,i^{-\beta}$ for $\beta \in \left\{0.2, 1\right\}$. Finally, we transform the samples as $X_i = \Sigma^{1/2} Z_i$, where $Z_i := (Z_{i1}, \ldots, Z_{id})^{\top}$.
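As a quick sanity check on this construction: a $\operatorname{Uniform}(-\sqrt{3},\sqrt{3})$ variable has mean zero and variance $(2\sqrt{3})^{2}/12 = 1$, so the vector of the $d$ independent entries of each sample has identity covariance, and hence
\[
  \operatorname{Cov}(X_i) \;=\; \Sigma^{1/2}\, I \,\Sigma^{1/2} \;=\; \Sigma,
  \qquad
  \Sigma_{jj} \;=\; K_{jj}\,\sigma_j^{2} \;=\; 25\, j^{-2\beta}.
\]
A larger $\beta$ therefore gives a faster decay of the coordinate variances.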


\begin{figure}[!hbt]
    \centering
    \includegraphics[width=0.6\linewidth]{images/computation_time_comparison.png} 
    \caption{Time taken by the bootstrap methods and the OjaVarEst algorithm. Experiments verify that our proposed algorithm is as fast as bootstrap with $b=1$.}
    \label{fig:computation_time}
\end{figure}

The first experiment (see Figure~\ref{fig:computation_time}) compares the running time of $\ojavarest$ with that of the bootstrap for variance estimation, varying the number of bootstrap samples, $b$, and the dimension $d$, with fixed $n = 5000$ and $\beta = 1$. Our algorithm is computationally on par with the bootstrap using a single bootstrap sample, and is substantially faster as the number of bootstrap samples increases. This is to be expected: our algorithm needs only two passes over the entire dataset, whereas the bootstrap must maintain $b$ bootstrap vectors, which slows computation by a factor of $b$. The bootstrap also requires $b$ times as much space to maintain the $b$ different iterates, which may be costly in the context of training large models.

\begin{table*}[!htb]
    \centering
    \renewcommand{\arraystretch}{1.1} % Adjust row height
    \resizebox{\textwidth}{!}{ % Scale the table to the full text width
    \begin{tabular}{l|cccc|cccc}
        \toprule
        & \multicolumn{4}{c|}{\textbf{Dist. 1} $(\beta = 1)$, Coordinate 1} 
        & \multicolumn{4}{c}{\textbf{Dist. 1} $(\beta = 1)$, Coordinate 2} \\
        \cmidrule(lr){2-5} \cmidrule(lr){6-9}
        ($n, d$) & $\ojavarest$ & BS ($b=1$) & BS ($b=10$) & BS ($b=20$) 
        & $\ojavarest$ & BS ($b=1$) & BS ($b=10$) & BS ($b=20$) \\
        \midrule
        2e3, 2e3 & 96.50\% & 65.00\% & 93.00\% & 95.00\% & 94.00\% & 69.50\% & 91.00\% & 91.50\% \\
        5e3, 2e3 & 95.50\% & 73.00\% & 91.50\% & 94.00\% & 95.50\% & 73.00\% & 89.00\% & 92.00\% \\
        1e4, 2e3 & 96.00\% & 69.00\% & 93.50\% & 94.50\% & 96.00\% & 71.50\% & 93.50\% & 96.00\% \\
        \midrule \midrule
        & \multicolumn{4}{c|}{\textbf{Dist. 2} $(\beta = 0.02)$, Coordinate 1} 
        & \multicolumn{4}{c}{\textbf{Dist. 2} $(\beta = 0.02)$, Coordinate 2} \\
        \cmidrule(lr){2-5} \cmidrule(lr){6-9}
        ($n, d$) & $\ojavarest$ & BS ($b=1$) & BS ($b=10$) & BS ($b=20$) 
        & $\ojavarest$ & BS ($b=1$) & BS ($b=10$) & BS ($b=20$) \\
        \midrule
        2e3, 2e3 & 94.50\% & 74.00\% & 87.00\% & 93.50\% & 94.00\% & 75.00\% & 86.50\% & 92.00\% \\
        5e3, 2e3 & 96.00\% & 71.00\% & 87.50\% & 92.00\% & 96.50\% & 72.50\% & 87.00\% & 93.00\% \\
        1e4, 2e3 & 94.00\% & 65.00\% & 95.00\% & 94.00\% & 94.50\% & 66.50\% & 94.50\% & 93.50\% \\
        \bottomrule
    \end{tabular}
    }
\caption{\label{tab:coverage_stats}Coverage statistics for our algorithm, $\ojavarest$, and the bootstrap (BS) estimator, for varying numbers of bootstrap samples $(b = 1, 10, 20)$, data distributions ($\beta = 1, 0.02$), and sample sizes $(n = 2000, 5000, 10000)$, with a fixed dimension $d = 2000$.}
\end{table*}

The next experiment (Table~\ref{tab:coverage_stats}) compares the quality of the variance estimates of our algorithm, $\hat{\sigma}^{2}_{\ojavarest}$, with those of the bootstrap, $\hat{\sigma}^{2}_{\bootstrap}$, for different numbers of bootstrap samples, $b$, and distributions, $\beta$. We record the average coverage rate, i.e., the proportion of trials in which the confidence interval provided by the algorithm contains the corresponding coordinate of the true eigenvector, for a target coverage probability of $95\%$ and the first two coordinates of the eigenvector. $\ojavarest$ performs similarly to the bootstrap with $b = 20$. However, as shown in Figure~\ref{fig:computation_time}, the bootstrap with $b = 20$ is roughly $20$ times slower. The bootstrap with $b=1$ takes a similar amount of time to $\ojavarest$ but has a significantly worse average coverage rate.
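To make the reported quantity concrete, suppose the experiment is repeated over $R$ independent trials (the symbol $R$, the trial index $r$, and $v_{1,k}$ for the $k$-th coordinate of the true eigenvector are notation assumed here only for illustration). The empirical coverage rate for coordinate $k$ is then
\[
  \widehat{\mathrm{cov}}_k \;=\; \frac{1}{R}\sum_{r=1}^{R} \mathbf{1}\left\{ v_{1,k} \in \mathrm{CI}^{(r)}_{k} \right\},
\]
where $\mathrm{CI}^{(r)}_{k}$ is the interval constructed in trial $r$; a well-calibrated method has $\widehat{\mathrm{cov}}_k$ close to the target $1-\alpha = 0.95$.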

Our final experiment (Figure~\ref{fig:mean_median_comparison}) compares Algorithm~\ref{alg:variance_estimation} with $m_1 = 3$ to using just the mean ($m_1 = 1$). Even with a choice as small as $m_1 = 3$, the uncertainty in variance estimation is reduced.

\begin{figure}[!hbt]
    \centering
    \begin{minipage}{0.6\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/variance_uncertainty_n_5k_d_2k_m1_1_m2_logn_bootstrap_samples_10_dist_1.png}
        \captionsetup{labelformat=empty}
        \caption*{(a) Mean (with $m_1 = 1$)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.6\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/variance_uncertainty_n_5k_d_2k_m1_3_m2_logn_bootstrap_samples_10_dist_1.png}
        \captionsetup{labelformat=empty}
        \caption*{(b) Median (with $m_1 = 3$)}
    \end{minipage}
    \caption{Comparison of Median and Mean in Algorithm~\ref{alg:variance_estimation} for $n = 5000$, $d = 2000$, $\beta = 1$, $b = 10$.}
    \label{fig:mean_median_comparison}
\end{figure}


\subsection{Real-world data experiments}
\label{sec:real_world_experiments}

We provide experiments on two real-world datasets in this section. For each dataset, we show the 95\% confidence intervals and plot the top 20 coordinates of the true offline eigenvector (red dot), used as a proxy for the ground truth.

\textbf{Time series + missing data}: The Human Activity Recognition (HAR) dataset \citep{anguita2013public} contains smartphone sensor readings from 30 subjects performing daily activities (walking, sitting, standing, etc.). Each data instance is a 2.56-second window of inertial sensor signals represented as a feature vector. Here, $n=7352$ and $d=561$. For each datum, we also replace a randomly chosen 10\% of the features with zero to simulate missing data. Even in this setting, which we do not analyze theoretically, most of the top 20 coordinates of the offline eigenvector lie inside the 95\% confidence interval (CI) returned by our algorithm (see Figure~\ref{fig:har_dataset}).

\begin{figure}[!hbt]
    \centering
    \begin{minipage}{0.6\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_HAR_dataset_total.png}
        \captionsetup{labelformat=empty}
        \caption*{(a)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.6\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_HAR_dataset_zoomed.png}
        \captionsetup{labelformat=empty}
        \caption*{(b)}
    \end{minipage}
    \caption{Uncertainty estimation for the HAR dataset ($n = 7352$, $d = 561$). The $\sin^2$ error of Oja's algorithm is $0.057$ for this dataset. (a) Plot of the eigenvector with 95\% confidence intervals for all coordinates; (b) the same plot zoomed in on indices 170--310 for exposition.}
    \label{fig:har_dataset}
\end{figure}

\begin{table*}[!hbt]
  \centering
  \begin{tabular}{c c c c c c c c c c c}
    \toprule
    Class & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
    \midrule
    $\sin^2$ error
      & 0.12 & 0.07 & 0.18 & 0.32 & 0.53 & 0.18 & 0.08 & 0.09 & 0.20 & 0.17 \\
    \bottomrule
  \end{tabular}
  \caption{  
    $\sin^2$ of the angle between the offline eigenvector and the subsampling
    eigenvector output by our algorithm, computed separately after filtering the
    MNIST data for each class.
  }
  \label{tab:mnist-sin2-error}
\end{table*}


\textbf{Image data}: We use the MNIST dataset~\citep{lecun1998gradient} of grayscale images of handwritten digits (0 through 9). Here, $n = 60{,}000$ and $d = 784$, with each image normalized to a $28 \times 28$ pixel resolution. We see (Figure~\ref{fig:mnist_dataset}) that for the classes where Oja's algorithm converges (small $\sin^2$ error in Table~\ref{tab:mnist-sin2-error}), most of the top 20 coordinates are inside their confidence intervals (CIs). Notable exceptions are classes 3 and 4, where several of the top 20 coordinates fall outside the corresponding CIs. This is expected, since our theory applies only when Oja's algorithm converges.
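For reference, a standard definition of the $\sin^2$ error between unit vectors $u$ and $v$, which we take to be the one underlying Table~\ref{tab:mnist-sin2-error}, is
\[
  \sin^{2}(u, v) \;=\; 1 - \left(u^{\top} v\right)^{2},
\]
which is invariant to the sign of either vector. Under this definition, the error of $0.53$ for class 4 corresponds to an angle of roughly $47^\circ$ between the two vectors, compared with roughly $15^\circ$ for the error of $0.07$ for class 1.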

\begin{figure}[!hbt]
    \centering
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class0.png}
        \captionsetup{labelformat=empty}
        \caption*{(0)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class1.png}
        \captionsetup{labelformat=empty}
        \caption*{(1)}
    \end{minipage}
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class2.png}
        \captionsetup{labelformat=empty}
        \caption*{(2)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class3.png}
        \captionsetup{labelformat=empty}
        \caption*{(3)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class4.png}
        \captionsetup{labelformat=empty}
        \caption*{(4)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class5.png}
        \captionsetup{labelformat=empty}
        \caption*{(5)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class6.png}
        \captionsetup{labelformat=empty}
        \caption*{(6)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class7.png}
        \captionsetup{labelformat=empty}
        \caption*{(7)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class8.png}
        \captionsetup{labelformat=empty}
        \caption*{(8)}
    \end{minipage}%
    \hfill
    \begin{minipage}{0.48\columnwidth}
        \centering
        \includegraphics[width=\columnwidth]{images/final_mnist_class9.png}
        \captionsetup{labelformat=empty}
        \caption*{(9)}
    \end{minipage}%
    \hfill
    \caption{Uncertainty Estimation for MNIST dataset. The $\sin^{2}$ error of Oja’s algorithm for each class is provided in Table~\ref{tab:mnist-sin2-error}.}
    \label{fig:mnist_dataset}
\end{figure}

