\section{Conclusion and Open Problems}
\label{sec: conclude}
In this work, we proved that the Riemannian gradient descent (RGD) algorithm with spectral initialization can recover a rank-$r$ matrix $\target$ of size $d_1 \times d_2$ using $O(r(\complexity)\kappa^2)$ Gaussian measurements, a sample complexity that is optimal among fast non-convex methods. Furthermore, its convergence rate is independent of $\kappa$, making it computationally efficient even when $\target$ is ill-conditioned.

Convex approaches based on nuclear norm minimization need only $O(r(\complexity))$ samples in the matrix sensing scenario, so our result is suboptimal by a factor of $\kappa^2$.
As a local search algorithm operating on the rank-$r$ matrix manifold, our RGD method's performance naturally depends on the geometric properties at the solution point $\target$. Existing analyses of the embedded manifold's local geometry (e.g., Lemma 5 in \citep{luo2022nonconvex}) demonstrate that the curvature at $\target$ scales with the condition number $\kappa$. This relationship is further evidenced in our two lemmas in \Cref{proof: lemmas RGD}, which show $\kappa$-dependence in tangent space perturbations.
This dependence on $\kappa$ is a common feature of fast non-convex methods, as shown in \cref{table: comparison}. 
Interestingly, our experimental results suggest that the required $m$ may not depend on $\kappa$, opening pathways for future research into improved initialization strategies or refined geometric analyses.
% However, all previous non-convex methods require at least this quadratic dependence on $\kappa$ in sample complexity. Whether the $\kappa^2$ term in the sample complexity can be removed remains an open problem. 

Moreover, our proof relies on a decoupling technique that critically depends on the rotational invariance of Gaussian random variables. Establishing optimal sample complexity in other settings, such as matrix completion and quantum state tomography, is therefore an interesting and challenging direction for future research.

