Extremely Sparse Johnson-Lindenstrauss Transform: From Theory to Algorithm

Rong Yin, Yong Liu, Weiping Wang, Dan Meng

ICDM 2020 (modified: 16 May 2022)

Abstract: Dimension reduction is a fundamental data mining task. However, its applicability in high-dimensional scenarios is limited by stringent computational requirements. To address this issue, we propose ESE, an extremely sparse Johnson-Lindenstrauss transform, which takes a substantial step forward in dimension reduction. The projection matrix of ESE is extremely sparse: by employing hash functions, it has only k nonzero elements, where k is the embedded dimension. Theoretical analysis shows that ESE has a lower time complexity than existing projection algorithms while retaining the best accuracy, (1+ε), in the general case, where 0 < ε ≪ 1. In particular, the optimal statistical accuracy is achieved with an embedded dimension of log(n)log(d)/ε, where n is the number of data points and d is the dimension of the data. Extensive experiments verify that ESE has a significant advantage in running time, with satisfactory accuracy, compared to state-of-the-art dimension reduction algorithms.
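
To make the sparsity claim concrete, the following is a minimal Python sketch of a k x d projection with exactly k nonzero entries, one per output coordinate. It is an illustration under stated assumptions, not the ESE construction: the abstract does not specify how the hash functions are used, so a seeded pseudo-random generator stands in for them, and the names `embedded_dimension`, `make_sparse_projection`, and `project` are hypothetical. The constant factor in log(n)log(d)/ε is assumed to be 1, and the sqrt(d/k) scaling is a common normalization choice rather than something stated in the abstract.

```python
import math
import random


def embedded_dimension(n, d, eps):
    """Embedded dimension k ~ log(n) * log(d) / eps.

    The form comes from the abstract; the constant factor (1 here)
    is an assumption made only for illustration.
    """
    return max(1, math.ceil(math.log(n) * math.log(d) / eps))


def make_sparse_projection(d, k, seed=0):
    """Describe a k x d matrix with exactly k nonzeros (one per row).

    Row i is stored as a (column, sign) pair. A seeded PRNG stands in
    for the hash functions mentioned in the abstract (assumption).
    """
    rng = random.Random(seed)
    return [(rng.randrange(d), rng.choice((-1.0, 1.0))) for _ in range(k)]


def project(x, entries):
    """Apply the sparse matrix to x in O(k) time.

    Output coordinate i is scale * sign_i * x[col_i]; the sqrt(d / k)
    scale keeps the squared norm unbiased under uniform column choice.
    """
    scale = math.sqrt(len(x) / len(entries))
    return [scale * sign * x[col] for col, sign in entries]


if __name__ == "__main__":
    n, d, eps = 1000, 100, 0.5
    k = embedded_dimension(n, d, eps)
    entries = make_sparse_projection(d, k, seed=42)
    x = [1.0] * d
    y = project(x, entries)
    print("k =", k, "nonzeros =", len(entries))
    print("squared norm before:", sum(v * v for v in x))
    print("squared norm after: ", sum(v * v for v in y))
```

Because each output coordinate reads a single input coordinate, applying this matrix costs O(k) per data point regardless of d, which is the kind of saving the abstract attributes to having only k nonzero elements. A plain coordinate-sampling matrix like this one preserves norms well only for vectors whose mass is spread across coordinates; how ESE obtains its (1+ε) guarantee in the general case is not described in the abstract and is not reproduced here.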
