Abstract: The recently developed matrix-based Rényi's <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\alpha$</tex-math></inline-formula> -order entropy enables measurement of information in data simply using the eigenspectrum of symmetric positive semi-definite (PSD) matrices in reproducing kernel Hilbert space, without estimation of the underlying data distribution. This intriguing property makes this new information measurement widely adopted in multiple statistical inference and learning tasks. However, the computation of such quantity involves the trace operator on a PSD matrix <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$G$</tex-math></inline-formula> to power <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\alpha$</tex-math></inline-formula> (i.e., <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">${\text{tr}}(G^\alpha)$</tex-math></inline-formula> ), with a normal complexity of nearly <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\mathcal {O}(n^{3})$</tex-math></inline-formula> , which severely hampers its practical usage when the number of samples (i.e., <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$n$</tex-math></inline-formula> ) is large. In this work, we present computationally efficient approximations to this new entropy functional that can reduce its complexity to even significantly less than <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\mathcal {O}(n^{2})$</tex-math></inline-formula> . To this end, we leverage the recent progress on Randomized Numerical Linear Algebra, developing Taylor, Chebyshev and Lanczos approximations to <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">${\text{tr}}(G^\alpha)$</tex-math></inline-formula> for arbitrary values of <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\alpha$</tex-math></inline-formula> by converting it into a matrix-vector multiplication problem. We also establish the connection between the matrix-based Rényi's entropy and PSD matrix approximation, which enables exploiting both clustering and block low-rank structure of <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$G$</tex-math></inline-formula> to further reduce the computational cost. We theoretically provide approximation accuracy guarantees and illustrate the properties for different approximations. Large-scale experimental evaluations on both synthetic and real-world data corroborate our theoretical findings, showing promising speedup with negligible loss in accuracy.
0 Replies
Loading