Given the definitions of the Vendi score and the population Vendi, a relevant question is how many samples are required to accurately estimate the population Vendi using the Vendi score. To address this question, we first prove the following concentration bound on the vector of ordered eigenvalues $[\lambda_1,\ldots,\lambda_n]$ of the kernel matrix for a normalized kernel function. %We defer the proof of the theoretical results to the Appendix.

\begin{theorem}\label{Thm: eigenvalue convergence}
Consider a normalized kernel function $k$ satisfying $k(x,x)=1$ for every $x\in\mathcal{X}$. Let $\widehat{\boldsymbol{\lambda}}_n$ be the vector of sorted eigenvalues of the normalized kernel matrix $\frac{1}{n}K$ for $n$ independent samples $x_1,\ldots ,x_n\sim P_X$, and let $\widetilde{\boldsymbol{\lambda}}$ be the vector of sorted eigenvalues of the underlying covariance matrix $\widetilde{C}_X$. Then, for every $n\ge 2+8\log (1/\delta)$, the following inequality holds with probability at least $1-\delta$:
\begin{equation*}
    \bigl\Vert \widehat{\boldsymbol{\lambda}}_n - \widetilde{\boldsymbol{\lambda}} \bigr\Vert_2 \, \le \, \sqrt{\frac{32\log\bigl(2/\delta\bigr)}{n}}.
\end{equation*}
Note that in computing the difference $\widehat{\boldsymbol{\lambda}}_n - \widetilde{\boldsymbol{\lambda}}$, we append $|d-n|$ zero entries to the lower-dimensional vector if the dimensions of the vectors $\widehat{\boldsymbol{\lambda}}_n$ and $\widetilde{\boldsymbol{\lambda}}$ do not match, where $d$ denotes the dimension of the kernel feature map.
\end{theorem}
\begin{proof}
    We defer the proof to the Appendix.
\end{proof}
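
To make the rate in Theorem~\ref{Thm: eigenvalue convergence} concrete, one can invert the bound for a target accuracy $\epsilon>0$; the symbol $\epsilon$ is introduced here only for illustration, and $\log$ is taken to be the natural logarithm. Requiring the right-hand side of the bound to be at most $\epsilon$ gives
\begin{equation*}
    \sqrt{\frac{32\log\bigl(2/\delta\bigr)}{n}} \,\le\, \epsilon \quad\Longleftrightarrow\quad n \,\ge\, \frac{32\log\bigl(2/\delta\bigr)}{\epsilon^2}.
\end{equation*}
For instance, with the illustrative choices $\delta=0.05$ and $\epsilon=0.1$, this reads $n\ge 32\log(40)/0.01\approx 1.2\times 10^{4}$, which also satisfies the requirement $n\ge 2+8\log(1/\delta)\approx 26$. The required sample size does not involve the dimension of the kernel feature map.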

Theorem~\ref{Thm: eigenvalue convergence} yields the following corollary, which gives a \emph{dimension-free convergence guarantee} for every $\mathrm{Vendi}_\alpha$ score of order $\alpha\ge 2$, including the RKE score (i.e., $\mathrm{Vendi}_2$).

\begin{corollary}\label{Corollary: Order greater than 2}
In the setting of Theorem \ref{Thm: eigenvalue convergence}, for every $\alpha\ge 2$ and $n\ge 2+8\log (1/\delta)$, the following bound holds with probability at least $1-\delta$:\vspace{-1mm}
\begin{align*}
    \Bigl\vert \mathrm{Vendi}_\alpha\bigl(x_1,\ldots,x_n\bigr)^{\frac{1-\alpha}{\alpha}} - \mathrm{Vendi}_\alpha\bigl(P_X\bigr)^{\frac{1-\alpha}{\alpha}} \Bigr\vert
    \le  &\sqrt{\frac{32\log\frac{2}{\delta}}{n}}.
\end{align*}
Notably, for $\alpha=2$, we arrive at the following bound on the gap between the empirical and population RKE scores:
\begin{align*}
    \Bigl\vert \mathrm{RKE}\bigl(x_1,\ldots,x_n\bigr)^{-1/2} - \mathrm{RKE}\bigl(P_X\bigr)^{-1/2} \Bigr\vert \le  \sqrt{\frac{32\log\frac{2}{\delta}}{n}}.
\end{align*}
\end{corollary}
\begin{proof}
    We defer the proof to the Appendix.
\end{proof}
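
The following short sketch indicates why no dimension factor appears for $\alpha\ge 2$; the complete argument is the one in the Appendix. It assumes that the order-$\alpha$ Vendi score is defined, as is standard, by $\mathrm{Vendi}_\alpha = \exp\bigl(\frac{1}{1-\alpha}\log\sum_{i}\lambda_i^\alpha\bigr)$, with the empirical score using the entries of $\widehat{\boldsymbol{\lambda}}_n$ and the population score using the entries of $\widetilde{\boldsymbol{\lambda}}$. Under this assumption, the power $\frac{1-\alpha}{\alpha}$ turns the score into an $\ell_\alpha$-norm of the eigenvalue vector:
\begin{equation*}
    \mathrm{Vendi}_\alpha^{\frac{1-\alpha}{\alpha}} \,=\, \Bigl(\sum_{i}\lambda_i^\alpha\Bigr)^{\frac{1}{\alpha}} \,=\, \Vert \boldsymbol{\lambda} \Vert_\alpha .
\end{equation*}
The reverse triangle inequality, together with $\Vert v\Vert_\alpha\le \Vert v\Vert_2$ for $\alpha\ge 2$, then gives
\begin{equation*}
    \Bigl\vert\, \bigl\Vert \widehat{\boldsymbol{\lambda}}_n \bigr\Vert_\alpha - \bigl\Vert \widetilde{\boldsymbol{\lambda}} \bigr\Vert_\alpha \Bigr\vert \,\le\, \bigl\Vert \widehat{\boldsymbol{\lambda}}_n - \widetilde{\boldsymbol{\lambda}} \bigr\Vert_\alpha \,\le\, \bigl\Vert \widehat{\boldsymbol{\lambda}}_n - \widetilde{\boldsymbol{\lambda}} \bigr\Vert_2 ,
\end{equation*}
and the right-hand side is controlled by Theorem~\ref{Thm: eigenvalue convergence}. Zero-padding the shorter vector changes neither norm, so the comparison is well defined when the two dimensions differ.
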
Therefore, the bound in Corollary \ref{Corollary: Order greater than 2} holds regardless of the dimension of the kernel feature map, indicating that the RKE score enjoys a universal convergence guarantee across all normalized kernel functions. Next, we show that Theorem~\ref{Thm: eigenvalue convergence} implies the following corollary on a dimension-dependent convergence guarantee for the order-$\alpha$ Vendi score with $1\le \alpha <  2$, including the standard (order-$1$) Vendi score.
\begin{corollary}\label{Corollary: Finite Dimension}
In the setting of Theorem \ref{Thm: eigenvalue convergence}, consider a finite-dimensional kernel feature map $\phi$ with $\mathrm{dim}(\phi)=d<\infty$. (a) For $\alpha=1$, assuming $n\geq 32e^2\log(2/\delta)$, the following bound holds with probability at least $1-\delta$:
\begin{align*}
    &\Bigl\vert\, \log\bigl(\mathrm{Vendi}_1\bigl(x_1,\ldots,x_n\bigr)\bigr) - \log\bigl(\mathrm{Vendi}_1\bigl(P_X\bigr)\bigr)\, \Bigr\vert \\
    \le\:  &\sqrt{\frac{8d\log\bigl(2/\delta\bigr)}{n}}\log\Bigl(\frac{nd}{32\log(2/\delta)}\Bigr).
\end{align*}
(b) For every $1< \alpha< 2$ and $n\ge 2+8\log (1/\delta)$, the following bound holds with probability at least $1-\delta$:
\begin{align*}
    &\Bigl\vert\, \mathrm{Vendi}_\alpha\bigl(x_1,\ldots,x_n\bigr)^{\frac{1-\alpha}{\alpha}} - \mathrm{Vendi}_\alpha\bigl(P_X\bigr)^{\frac{1-\alpha}{\alpha}}\, \Bigr\vert \\
    \le\:  &\sqrt{\frac{32d^{2-\alpha}\log\bigl(2/\delta\bigr)}{n}}.
\end{align*}
%We note that under a kernel function with finite dimension $d$, the above bound will be $\mathcal{O}(\sqrt{\frac{d}{n}}\log(nd))$. We note that this result will remain true for t-truncated population Vendi.
\end{corollary}
\begin{proof}
    We defer the proof to the Appendix.
\end{proof}

%We emphasize that the above concentration guarantee holds under a finite feature map dimension when $\alpha<2$. 
%On the other hand, if the order $\alpha$ of Vendi score is less than $2$, then Theorem \ref{Thm: eigenvalue convergence} provides a statistical convergence guarantee under a bounded dimension kernel map as shown in the following corollary.
Therefore, assuming a finite feature map dimension $d<\infty$ and an entropy order $1\le \alpha<2$, the above results indicate that the Vendi score converges to the underlying population Vendi given $n=O(d^{2-\alpha})$ samples, up to a logarithmic factor in the case $\alpha=1$. This result is consistent with our numerical observations of the convergence of the Vendi score under the finite-dimensional cosine similarity kernel in Figure~\ref{fig:kernel_convergence}. %Next, we discuss how to extend the above result to an infinite-dimension kernel map by defining the truncated population Vendi.    
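
The dependence on $d$ can be made explicit by inverting part (b) of Corollary~\ref{Corollary: Finite Dimension} for a target accuracy $\epsilon>0$; here $\epsilon$ is a symbol introduced only for this illustration, and $\log$ is taken to be the natural logarithm. Requiring the right-hand side of the bound to be at most $\epsilon$ gives, for $1<\alpha<2$,
\begin{equation*}
    \sqrt{\frac{32d^{2-\alpha}\log\bigl(2/\delta\bigr)}{n}} \,\le\, \epsilon \quad\Longleftrightarrow\quad n \,\ge\, \frac{32\,d^{2-\alpha}\log\bigl(2/\delta\bigr)}{\epsilon^2}.
\end{equation*}
As a purely illustrative example, taking $\alpha=1.5$, $d=100$, $\delta=0.05$, and $\epsilon=0.1$ gives $d^{2-\alpha}=10$ and hence $n\ge 320\log(40)/0.01\approx 1.2\times 10^{5}$. Since the factor $d^{2-\alpha}$ equals $1$ at $\alpha=2$, the same expression matches the dimension-free rate of Corollary~\ref{Corollary: Order greater than 2} at that order.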


