A geometrical connection between sparse and low-rank matrices and its application to manifold learning
Many changes were made to incorporate suggestions from the anonymous reviewers. Most notably, the main manuscript now contains more discussion of non-convexity and convergence, and a supplementary appendix reports new results from a nonlinear (versus linear) baseline for dimensionality reduction. A number of other minor changes were made throughout the paper.
The following is a list of specific changes; page numbers refer to the current manuscript.
page 1: The abstract now indicates that the sparse matrix is recovered specifically via an elementwise nonlinearity; it also states, earlier in the abstract, that the embedding is norm-preserving. The first sentence of the introduction has been reworded, and the word "geometrical" has been added to the title.
page 2: The second paragraph of the introduction gives another example of rank reduction (when $S$ is the identity matrix).
page 4: Further explanation of the model has been added to the paragraph after equation 2.
page 5: The last paragraph of section 2.1 clarifies that many inputs are regarded as similar even when they are not one-nearest neighbors. The caption to Figure 2 clarifies that the blue/orange histograms are separately normalized and have small but nonzero overlap. A footnote has been added on the handling of outliers in graph-based methods.
page 6: The top paragraph emphasizes that the optimization is non-convex and discusses how one tests empirically for convergence.
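For the reviewers' convenience, the sketch below illustrates the kind of empirical convergence test described there: the objective is monitored across alternating block updates and iteration stops when its relative decrease falls below a tolerance. The callables update_sparse, update_lowrank, and objective are hypothetical stand-ins, not the manuscript's actual update rules.

    import numpy as np

    def alternating_minimization(Y, update_sparse, update_lowrank, objective,
                                 max_iter=500, tol=1e-8):
        """Alternate between two block updates of a non-convex objective and
        stop when the relative decrease of the objective falls below `tol`.
        The three callables are placeholders for the paper's actual updates."""
        S = np.zeros_like(Y)          # sparse factor (placeholder initialization)
        L = np.zeros_like(Y)          # low-rank factor (placeholder initialization)
        history = [objective(Y, S, L)]
        for _ in range(max_iter):
            S = update_sparse(Y, L)   # minimize over the sparse factor
            L = update_lowrank(Y, S)  # minimize over the low-rank factor
            history.append(objective(Y, S, L))
            # empirical convergence test: relative change in the objective
            if abs(history[-2] - history[-1]) <= tol * max(abs(history[-2]), 1.0):
                break
        return S, L, history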
page 8: The typo after eq. (10) has been fixed, in the text describing the least-squares sense in which linear projections from the SVD are optimal. The conflicting notation between eqs. (11) and (12) has been corrected. A footnote directs the reader to Appendix B, which contains results from a modified implementation of the Isomap algorithm. Another footnote mentions the need to safeguard against stereotypes and implicit biases when word vectors are used in real-world applications.
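For context, the least-squares optimality referred to here is the standard Eckart-Young property; writing the data matrix as $X = U \Sigma V^\top$ and letting $U_r$ denote the first $r$ left singular vectors (these symbols may differ from the manuscript's notation), the linear projection onto the top singular subspace gives the best rank-$r$ reconstruction in the Frobenius norm:
\[
U_r U_r^\top X \;=\; \arg\min_{\hat{X}:\ \mathrm{rank}(\hat{X}) \le r} \|X - \hat{X}\|_F^2 .
\]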
page 12: An acknowledgements section has been added.
page 17: Appendix A.1 contains more discussion on initializing non-convex optimizations for nonlinear dimensionality reduction.
page 18: Figure 8 (new) plots the convergence of the objective function for the alternating minimization.
pages 18-21: A new supplementary section (Appendix B) presents results from a modified implementation of Isomap based on cosine distances. This section also includes three new figures to present and interpret these results.
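As a rough illustration of this kind of modification (not the authors' exact implementation), one way to run Isomap on cosine distances is to precompute the pairwise distance matrix and pass it to scikit-learn's Isomap with metric="precomputed" (available in scikit-learn 0.22 and later); the data below is a random placeholder.

    import numpy as np
    from sklearn.manifold import Isomap
    from sklearn.metrics.pairwise import cosine_distances

    X = np.random.randn(200, 50)      # placeholder inputs, e.g. word vectors
    D = cosine_distances(X)           # pairwise cosine distances

    iso = Isomap(n_neighbors=10, n_components=2, metric="precomputed")
    Z = iso.fit_transform(D)          # 2-D embedding from geodesic distances
                                      # over the cosine-distance k-NN graph
    print(Z.shape)                    # (200, 2)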