Keywords: Matrix completion, Subspace recovery, Optimization landscape, Ultrasparse
Abstract: Matrix completion is a classical problem that has received recurring interest from a wide range of fields. In this paper, we revisit this problem in an *ultra-sparse sampling regime*, where each entry of an unknown $n\times d$ matrix $M$ (with $n \ge d$) is observed independently with probability $p = C / d$, for a fixed constant $C \ge 2$. This setting is motivated by applications involving large, sparse panel datasets, where the number of rows (users) far exceeds the number of columns (items). When each row contains only about $C$ observed entries, fewer than the rank of $M$, accurate imputation of $M$ is impossible. Instead, we focus on estimating the *row span* of $M$, or equivalently, the averaged *second-moment matrix* $T = M^{\top} M / n$.
The empirical second-moment matrix computed from observational data exhibits non-random and sparse missingness. We propose an *unbiased estimator* that normalizes each nonzero entry of the second moment by its observed frequency, followed by gradient descent to impute the missing entries of $T$. This normalization divides a weighted sum of $n$ binomial random variables by their total number of ones, a nonlinear operation. We show that the estimator is unbiased for any value of $p$ and enjoys low variance. When the row vectors of $M$ are drawn uniformly from a rank-$r$ factor model satisfying an incoherence condition, we prove that if $n \ge O({d r^5 \epsilon^{-2} C^{-2} \log d})$, any local minimum of the gradient-descent objective is approximately global and recovers $T$ with error at most $\epsilon^2$.
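As a rough illustration (a sketch, not the paper's implementation), the frequency-normalized estimator described above can be computed as follows: for each pair of columns $(j,k)$, sum the products $M_{ij}M_{ik}$ over rows where both entries are observed, and divide by the number of such rows. With every entry observed, this reduces exactly to $T = M^\top M / n$. The function name and mask convention are illustrative assumptions.

```python
import numpy as np

def normalized_second_moment(M, mask):
    """Frequency-normalized estimate of T = M^T M / n (illustrative sketch).

    M    : (n, d) data matrix
    mask : (n, d) 0/1 array; mask[i, j] = 1 iff entry M[i, j] is observed
    """
    M_obs = M * mask
    # Weighted sums over rows where columns j and k are co-observed.
    S = M_obs.T @ M_obs
    # Number of rows in which both columns were observed.
    counts = mask.T.astype(float) @ mask.astype(float)
    # Divide each entry by its observation frequency; leave unseen pairs at 0.
    T_hat = np.divide(S, counts, out=np.zeros_like(S, dtype=float),
                      where=counts > 0)
    return T_hat
```

When the sampling is very sparse ($p = C/d$), many pairs have `counts == 0` and the corresponding entries of `T_hat` remain missing; the paper's gradient-descent step is what imputes them.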
Experiments on both synthetic and real-world data validate our approach. On three MovieLens datasets, our algorithm reduces bias by $88$% relative to several baseline estimators. Using synthetic data, we also empirically verify that the required sample size $n$ scales linearly with $d$. Finally, on the Amazon reviews dataset with sparsity $10^{-7}$, our method reduces the recovery error of $T$ by $59$% and of $M$ by $38$% compared to existing matrix completion methods.
Submission Number: 63