Keywords: low-rank matrix recovery, optimal transport, min-max optimization, permutation matrix
TL;DR: We formulate a new matrix recovery problem for addressing a common obstacle in utilizing heterogeneous data, and develop an efficient algorithm to solve it.
Abstract: We study a matrix recovery problem with unknown correspondence: given the observation matrix $M_o=[A,\tilde P B]$, where $\tilde P$ is an unknown permutation matrix, we aim to recover the underlying matrix $M=[A,B]$. Such a problem commonly arises in applications where heterogeneous data are utilized and the correspondence among them is unknown, e.g., due to data mishandling or privacy concerns. We show that, in some applications, it is possible to recover $M$ by solving a nuclear norm minimization problem. Moreover, under a proper low-rank condition on $M$, we derive a non-asymptotic error bound for the recovery of $M$. We propose an algorithm, $\text{M}^3\text{O}$ (Matrix recovery via Min-Max Optimization), which recasts this combinatorial problem as a continuous minimax optimization problem and solves it by proximal gradient with a Max-Oracle. $\text{M}^3\text{O}$ can also be applied to a more general scenario where $M_o$ has missing entries and there are multiple groups of data with distinct unknown correspondences. Experiments on simulated data, the MovieLens 100K dataset, and the Yale B database show that $\text{M}^3\text{O}$ achieves state-of-the-art performance compared with several baselines and can recover the ground-truth correspondence with high accuracy.
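To make the observation model concrete, here is a minimal NumPy sketch of the setup $M_o=[A,\tilde P B]$. It is an illustration only, not the $\text{M}^3\text{O}$ algorithm: the helper names (`make_instance`, `nuclear_norm`), the Gaussian factors, the matrix sizes, and the choice of rank 1 are all assumptions made for this example. Rank 1 is chosen because the nuclear norm then equals the Frobenius norm, which a row permutation leaves unchanged, so the permuted matrix can never have a smaller nuclear norm than $M$.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of the singular values of X."""
    return np.linalg.norm(X, ord="nuc")

def make_instance(n, m_a, m_b, rank, seed=0):
    """Build a low-rank M = [A, B] and the observation M_o = [A, P @ B].

    Hypothetical helper for illustration; P is a random row permutation.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, rank))
    V = rng.standard_normal((rank, m_a + m_b))
    M = U @ V                      # rank is at most `rank`
    A, B = M[:, :m_a], M[:, m_a:]
    perm = rng.permutation(n)
    P = np.eye(n)[perm]            # P @ B reorders the rows of B
    M_o = np.hstack([A, P @ B])
    return M, M_o, P

m_a = 4
M, M_o, P = make_instance(n=30, m_a=m_a, m_b=4, rank=1)

# Shuffling the rows of B keeps the Frobenius norm but breaks the
# low-rank structure, so the nuclear norm does not decrease (rank-1 case).
print("nuclear norm of M:  ", nuclear_norm(M))
print("nuclear norm of M_o:", nuclear_norm(M_o))

# Applying the inverse permutation P.T to the second block restores M.
M_rec = np.hstack([M_o[:, :m_a], P.T @ M_o[:, m_a:]])
print("recovered exactly:", np.allclose(M_rec, M))
```

In this toy case the true permutation is known, so undoing it is trivial; the recovery problem in the abstract is to find it when only $M_o$ is given, using the nuclear norm as the criterion.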
Supplementary Material: pdf
Other Supplementary Material: zip