Approximating Gram Matrix Spectral Norm using Random Features with Applications in Efficient Norm Estimation
Abstract: This paper considers spectral norm estimation for the Gram matrix of $n$ data points. Efficient estimation methods such as power iteration and Nyström work directly with the Gram matrix or a submatrix of it, so their time complexities depend quadratically on the data size. This paper investigates an orthogonal direction: accelerating estimation through norm approximation. Building on the seminal work on random features for kernel methods, we propose to approximate the spectral norm of the Gram matrix by the spectral norm of a random feature matrix, which is often much smaller and hence more efficient to work with. Our theoretical analysis suggests the approximation has an $\tilde{O}(n/\sqrt{q})$ absolute error and an $\tilde{O}(\ln n/\sqrt{q})$ relative error with $q$ random features, close to the errors of prior methods. We apply the approximation to accelerate power iteration and Nyström, improving their time complexities by replacing the quadratic dependence on data size with a linear dependence. Experimental results on two data sets show that the accelerated methods significantly reduce estimation time while maintaining estimation accuracy.
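The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Gaussian (RBF) kernel $k(x, y) = \exp(-\gamma \|x - y\|^2)$ and standard random Fourier features, and the function names `rff_features` and `spectral_norm_via_features`, the synthetic data, and the values of `gamma`, `q`, and `iters` are all illustrative choices. With an $n \times q$ feature matrix $Z$ such that $ZZ^\top \approx K$, the spectral norm of $K$ is approximated by $\|ZZ^\top\|_2 = \|Z^\top Z\|_2$, and power iteration needs only matrix-vector products with $Z$, costing $O(nq)$ per step rather than $O(n^2)$.

```python
import numpy as np

def rff_features(X, q, gamma, rng):
    """Random Fourier features Z (n x q) with Z @ Z.T ~ exp(-gamma * ||x - y||^2).

    Assumption: Gaussian (RBF) kernel; frequencies are drawn from N(0, 2*gamma*I).
    """
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, q))
    b = rng.uniform(0.0, 2.0 * np.pi, size=q)
    return np.sqrt(2.0 / q) * np.cos(X @ W + b)

def spectral_norm_via_features(Z, iters=100, rng=None):
    """Power iteration on Z.T @ Z using only matvecs with Z (O(n q) per step)."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.normal(size=Z.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = Z.T @ (Z @ v)
        v = w / np.linalg.norm(w)
    # Rayleigh quotient of Z.T @ Z at the converged unit vector
    return float(v @ (Z.T @ (Z @ v)))

# Synthetic data, used only to compare against the exact Gram spectral norm.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
gamma = 0.1

# Exact Gram matrix: O(n^2) memory and time, feasible here only because n is small.
sq = np.sum(X**2, axis=1)
K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
exact = np.linalg.norm(K, 2)

Z = rff_features(X, q=2000, gamma=gamma, rng=rng)
approx = spectral_norm_via_features(Z, rng=rng)
rel_err = abs(approx - exact) / exact
print(f"exact={exact:.3f} approx={approx:.3f} rel_err={rel_err:.4f}")
```

In this sketch the exact Gram matrix is formed only for comparison; in a setting where $n$ is large, one would build $Z$ alone and never materialize $K$.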
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Stephen_Becker1
Submission Number: 3107