Keywords: dimensionality reduction, sparsity, Johnson-Lindenstrauss
TL;DR: We study beyond worst-case dimensionality reduction for sparse vectors
Abstract: We study beyond worst-case dimensionality reduction for $s$-sparse vectors (vectors with at most $s$ non-zero coordinates). Our work is divided into two parts, each focusing on a different facet of beyond worst-case analysis:
(a) We first consider average-case guarantees for embedding $s$-sparse vectors. Here, a well-known folklore upper bound based on the birthday paradox states: For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^d$, there exists a linear map $A: \mathbb{R}^d \rightarrow \mathbb{R}^{O(s^2)}$ which \emph{exactly} preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_p$ norm (as opposed to the usual setting where guarantees hold for all vectors). We provide novel lower bounds showing that this is indeed optimal in many settings. Specifically, any oblivious linear map satisfying similar average-case guarantees must map to $\Omega(s^2)$ dimensions. The same lower bound also holds for a wider class of sufficiently smooth maps, including `encoder-decoder schemes', where we compare the norm of the original vector to that of a smooth function of the embedding. These lower bounds reveal a surprising separation result for smooth embeddings of sparse vectors, as an upper bound of $O(s \log(d))$ is possible if we instead use arbitrary functions, e.g., via compressed sensing algorithms.
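To make the folklore birthday-paradox upper bound concrete, below is a minimal numerical sketch (illustrative only, not code from the submission): each of the $d$ coordinates is hashed into one of $m = O(s^2)$ buckets and bucket contents are summed, giving an oblivious linear map; whenever the $s$ non-zero coordinates of a vector land in distinct buckets, every $\ell_p$ norm is preserved exactly. The values of d, s, and the constant 50 are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
d, s = 10_000, 20            # ambient dimension and sparsity (illustrative values)
m = 50 * s * s               # target dimension ~ O(s^2); Pr[collision] <= s^2/(2m) = 1%

# One random hash h: [d] -> [m] defines the oblivious linear map A with
# (A x)[j] = sum_{i : h(i) = j} x_i  (a 0/1 matrix with a single 1 per column).
h = rng.integers(0, m, size=d)

def embed(x):
    y = np.zeros(m)
    np.add.at(y, h, x)       # y[h[i]] += x[i]; the map is linear in x
    return y

def random_sparse_vector():
    x = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x[support] = rng.standard_normal(s)
    return x, support

trials, exact = 1000, 0
for _ in range(trials):
    x, support = random_sparse_vector()
    if len(set(h[support])) == s:        # birthday-paradox event: no bucket collision
        y = embed(x)
        # with no collision, the non-zero entries of y are exactly those of x,
        # so every l_p norm is preserved exactly
        assert all(np.isclose(np.linalg.norm(y, p), np.linalg.norm(x, p))
                   for p in (1, 2, np.inf))
        exact += 1

print(f"norm preserved exactly for {exact}/{trials} random s-sparse vectors (~99% expected)")
```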
(b) Given these lower bounds, we specialize to sparse \emph{non-negative} vectors in hopes of improved upper bounds. For a dataset $X$ of non-negative $s$-sparse vectors and any $p \ge 1$, we construct a non-linear map $f$ embedding $X$ into $O(s\log(|X|s)/\varepsilon^2)$ dimensions while preserving all pairwise distances in the $\ell_p$ norm up to $1\pm \varepsilon$, with no dependence on $p$. Surprisingly, the non-negativity assumption enables much smaller embeddings than are possible for arbitrary sparse vectors, where the best known bound suffers an exponential $(\log |X|)^{O(p)}$ dependence. Our map also guarantees \emph{exact} dimensionality reduction for the $\ell_{\infty}$ norm by embedding $X$ into $O(s\log |X|)$ dimensions, which is tight. We further give separation results showing that both the non-linearity of $f$ and the non-negativity of $X$ are necessary, and provide downstream algorithmic improvements using our embedding.
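As an illustration of why non-negativity can enable exact $\ell_{\infty}$ dimensionality reduction in $O(s\log |X|)$ dimensions, here is a minimal sketch of a max-pooling style map over hashed buckets. This is our own hedged illustration, not necessarily the construction used in the submission, and the parameters d, s, n, and the constants 4 are arbitrary demo choices. The key facts it relies on are: for non-negative vectors, bucket-wise maxima never overestimate the $\ell_{\infty}$ distance, and the distance is recovered exactly as soon as one repetition isolates the coordinate attaining it.

```python
import numpy as np

rng = np.random.default_rng(1)
d, s, n = 5_000, 20, 100            # ambient dimension, sparsity, dataset size (illustrative)
b = 4 * s                           # buckets per repetition ~ O(s)
k = 4 * int(np.ceil(np.log2(n)))    # repetitions ~ O(log |X|); total dimension k*b = O(s log |X|)
hashes = rng.integers(0, b, size=(k, d))

def embed(x):
    """Non-linear map: in each repetition, max-pool x over every hash bucket.

    For non-negative x, y: |max_S x - max_S y| <= max_{i in S} |x_i - y_i|, so the
    embedding never overestimates l_inf distances; it is exact whenever, in some
    repetition, the coordinate attaining ||x - y||_inf shares no bucket with any
    other support coordinate of x or y.
    """
    blocks = np.zeros((k, b))
    for j in range(k):
        np.maximum.at(blocks[j], hashes[j], x)   # blocks[j][hashes[j][i]] = max(., x[i])
    return blocks.ravel()

def random_nonneg_sparse():
    x = np.zeros(d)
    x[rng.choice(d, size=s, replace=False)] = rng.random(s)   # non-negative entries
    return x

X = [random_nonneg_sparse() for _ in range(n)]
E = [embed(x) for x in X]

worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        true_dist = np.linalg.norm(X[i] - X[j], np.inf)
        emb_dist = np.linalg.norm(E[i] - E[j], np.inf)
        worst = max(worst, abs(true_dist - emb_dist))

print("largest l_inf distortion over all pairs:", worst)   # typically 0.0, i.e., exact
```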
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8411