Spectral Guarantees for Adversarial Streaming PCA

Eric Price, Zhiyang Xun

Published: 01 Jan 2024, Last Modified: 17 May 2025, FOCS 2024, CC BY-SA 4.0

Abstract: In streaming PCA, we see a stream of vectors $x_1, \ldots, x_n \in \mathbb{R}^d$ and want to estimate the top eigenvector of their covariance matrix. This is easier if the spectral ratio $R = \lambda_1/\lambda_2$ is large. We ask: how large does $R$ need to be to solve streaming PCA in $\tilde{O}(d)$ space? Existing algorithms require $R = \tilde{\Omega}(d)$. We show:

• For all mergeable summaries, $R = \tilde{\Omega}(\sqrt{d})$ is necessary.
• In the insertion-only model, a variant of Oja's algorithm gets $o(1)$ error for $R = O(\log n \log d)$.
• No algorithm with $o(d^2)$ space gets $o(1)$ error for $R = O(1)$.

Our analysis is the first application of Oja's algorithm to adversarial streams. It is also the first algorithm for adversarial streaming PCA that is designed for a spectral, rather than Frobenius, bound on the tail; and the bound it needs is exponentially better than is possible by adapting a Frobenius guarantee.
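
For readers unfamiliar with Oja's algorithm, the following is a minimal sketch of the textbook update rule, $v \leftarrow v + \eta\, x (x^\top v)$ followed by renormalization, which keeps only a single $d$-dimensional vector in memory. It is not the variant analyzed in the paper, whose details the abstract does not give. The function names, the learning rate `eta`, and the synthetic (non-adversarial) test stream are all assumptions made for illustration.

```python
import math
import random


def oja_top_direction(stream, d, eta=0.05, seed=0):
    """Textbook Oja update: one pass over the stream, O(d) working memory.

    `eta` is an assumed fixed learning rate, not a value from the paper.
    """
    rng = random.Random(seed)
    # Start from a random unit vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for x in stream:
        # Inner product <x, v> of the new sample with the current estimate.
        dot = sum(xi * vi for xi, vi in zip(x, v))
        # v <- v + eta * x * <x, v>, then renormalize to unit length.
        v = [vi + eta * dot * xi for xi, vi in zip(x, v)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return v


def make_stream(n, d, seed=1):
    """Hypothetical test data: large variance along e_1, small elsewhere."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [0.1 * rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] = 3.0 * rng.gauss(0.0, 1.0)
        yield x


v_hat = oja_top_direction(make_stream(n=2000, d=10), d=10)
alignment = abs(v_hat[0])  # |<v_hat, e_1>|, close to 1 when e_1 is recovered
```

The synthetic stream here has a very large spectral ratio, so it only illustrates the mechanics of the update; it says nothing about the adversarial setting or the $R = O(\log n \log d)$ regime that the paper addresses.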