On Sparsity and Sub-Gaussianity in the Johnson-Lindenstrauss Lemma

TMLR Paper4587 Authors

31 Mar 2025 (modified: 23 Sept 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: We provide a simple proof of the Johnson-Lindenstrauss lemma for sub-Gaussian variables. We extend the analysis to identify how sparse projections can be, and what cost sparsity incurs on the target dimension. The Johnson-Lindenstrauss lemma is the theoretical core of dimensionality reduction methods based on random projections. While its original formulation involves matrices with Gaussian entries, the computational cost of random projections can be drastically reduced by the use of simpler variables, especially if they vanish with high probability. In this paper, we propose a simple and elementary analysis of random projections under classical assumptions that highlights the key role of sub-Gaussianity. Furthermore, we show how to extend it to sparse projections, emphasizing the limits induced by the sparsity of the data itself.
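To make the abstract's setting concrete, here is a minimal sketch of a sparse random projection with sub-Gaussian entries, in the style of Achlioptas: each entry is ±√s with probability 1/(2s) each and 0 otherwise, so entries have unit variance and squared norms are preserved in expectation. The function name `sparse_jl_projection` and the parameter choices are illustrative, not taken from the paper.

```python
import numpy as np

def sparse_jl_projection(X, k, s=3, rng=None):
    """Project the rows of X from dimension d down to k using a sparse
    sign matrix (Achlioptas-style): entries take the values +sqrt(s) or
    -sqrt(s) with probability 1/(2s) each, and 0 with probability 1 - 1/s.
    Each entry has mean 0 and variance 1, so E[||x R||^2 / k] = ||x||^2.
    Larger s gives a sparser (cheaper) matrix at the price of heavier
    tails, which is the trade-off the paper analyzes.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    d = X.shape[1]
    # Sample the d x k projection matrix entrywise.
    R = rng.choice([np.sqrt(s), 0.0, -np.sqrt(s)],
                   size=(d, k),
                   p=[1 / (2 * s), 1 - 1 / s, 1 / (2 * s)])
    # Scale by 1/sqrt(k) so that squared norms are preserved in expectation.
    return X @ R / np.sqrt(k)

# Illustrative check: norms of projected points stay close to the originals.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 1000))      # 50 points in dimension 1000
Y = sparse_jl_projection(X, k=300, rng=rng)
ratios = np.linalg.norm(Y, axis=1) / np.linalg.norm(X, axis=1)
print(Y.shape, ratios.min(), ratios.max())
```

With s = 3 about two thirds of the matrix entries are zero, so the matrix-vector products cost roughly a third of a dense projection; the paper's analysis quantifies how far s can be pushed before the target dimension k must grow to compensate.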
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We have corrected the typos pointed out by the reviewers. We modified the introduction so as to clarify the list of contributions and the purpose of the paper. In particular, we have modified and hopefully improved the presentation of the content of Section 4, reformulated a few references, and included a comparison with Li's paper. We have also added a remark on the mostly theoretical scope of Theorem 2. Finally, we have added, in the early parts of this revision, a few elements on the approaches of the state of the art (e.g. the moment arguments of Achlioptas) and compared them with our approach.
Assigned Action Editor: ~Jean_Barbier2
Submission Number: 4587