\begin{abstract}
We study the problem of recovering an unknown $d_1 \times d_2$ rank-$r$ matrix from $m$ random linear measurements. Convex methods achieve the optimal sample complexity $m = \Omega(r(d_1 + d_2))$ but are computationally expensive. Non-convex approaches, while more computationally efficient, often require a suboptimal sample complexity of $m = \Omega(r^2(d_1 + d_2))$. A recent advance achieves $m = \Omega(rd_1)$ with a fast non-convex approach, but it relies on the restrictive assumption that the matrix is positive semidefinite (PSD) and suffers from slow convergence in ill-conditioned settings. Bridging this gap, we show that Riemannian gradient descent (RGD) achieves both optimal sample complexity and computational efficiency without requiring the PSD assumption. Specifically, for Gaussian measurements, RGD exactly recovers the low-rank matrix with $m = \Omega(r(d_1 + d_2))$ measurements, matching the information-theoretic lower bound, and converges linearly to the global minimum with an arbitrarily small convergence rate.
\end{abstract}
