A Geometric Analysis of PCA

Published: 18 Sept 2025, Last Modified: 29 Oct 2025
Venue: NeurIPS 2025 poster
License: CC BY 4.0
Keywords: PCA, learning theory, CLT, self-concordance, Grassmannian, block Rayleigh quotient
TL;DR: We prove the asymptotic normality of PCA on the Grassmannian, and derive a tight non-asymptotic bound on its excess risk using self-concordance.
Abstract: What property of the data distribution determines the excess risk of principal component analysis? In this paper, we provide a precise answer to this question. We establish a central limit theorem for the error of the principal subspace estimated by PCA, and derive the asymptotic distribution of its excess risk under the reconstruction loss. We also obtain a non-asymptotic upper bound on the excess risk of PCA that recovers, in the large-sample limit, our asymptotic characterization. Underlying our contributions is the following result: we prove that the negative block Rayleigh quotient, defined on the Grassmannian, is generalized self-concordant along geodesics emanating from its minimizer whose maximum rotation is less than $\pi/4$.
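For reference, a minimal sketch of the standard definitions behind the terms above, under the usual conventions; the symbols $\Sigma$, $U$, $U_\star$, $\ell$, $f$ are our own notation, not necessarily the paper's. For data $x \in \mathbb{R}^n$ with second-moment matrix $\Sigma = \mathbb{E}[xx^\top]$ and an orthonormal basis $U \in \mathbb{R}^{n \times k}$ of a $k$-dimensional subspace, the reconstruction loss and the negative block Rayleigh quotient are

$$\ell(U; x) = \|x - UU^\top x\|^2 = \|x\|^2 - \operatorname{tr}(U^\top x x^\top U), \qquad f(U) = -\operatorname{tr}(U^\top \Sigma U),$$

so that $\mathbb{E}[\ell(U; x)] = \operatorname{tr}(\Sigma) + f(U)$. Since $f(UO) = f(U)$ for every orthogonal $O \in \mathbb{R}^{k \times k}$, $f$ descends to a function on the Grassmannian $\operatorname{Gr}(n, k)$, and the excess risk of a subspace $U$ relative to the top-$k$ principal subspace $U_\star$ is

$$\mathcal{E}(U) = f(U) - f(U_\star) = \operatorname{tr}(U_\star^\top \Sigma U_\star) - \operatorname{tr}(U^\top \Sigma U) \ge 0.$$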
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 10022