The Sparse Matrix-Based Random Projection: A Study of Binary and Ternary Quantization

TMLR Paper3480 Authors

12 Oct 2024 (modified: 20 Oct 2024). Under review for TMLR. CC BY 4.0
Abstract: Random projection is a straightforward yet effective dimension reduction technique, widely used in various classification tasks. After projection, quantization is often applied to further simplify the projected data. Typically, quantized projections are required to approximately preserve the pairwise distances between the original data points, to avoid significant performance degradation in classification tasks. To date, this distance preservation property has been investigated for the commonly used Gaussian matrix. In this paper, we further explore this property for the hardware-friendly $\{0,1\}$-binary matrix, specifically when the projections undergo element-wise quantization into two types of low bit-width codes: $\{0,1\}$-binary codes and $\{0,\pm1\}$-ternary codes. We find that the distance preservation property tends to be better maintained when the binary projection matrix is sparse. This finding is corroborated by classification experiments, in which very sparse binary matrices, with only one nonzero entry per column, achieve better or comparable classification performance relative to denser binary matrices and Gaussian matrices. This presents an opportunity to significantly reduce the computational and storage complexity of the quantized random projection model without compromising, and potentially even improving, its classification performance.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jeff_Phillips1
Submission Number: 3480
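
The pipeline summarized in the abstract (projection with a very sparse $\{0,1\}$ matrix, followed by element-wise quantization) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform choice of the nonzero row in each column, and the threshold `tau` are all assumptions made here for illustration, since the abstract does not specify the quantizers.

```python
import numpy as np

def sparse_binary_matrix(m, n, rng):
    """{0,1} matrix of shape (m, n) with exactly one nonzero entry per column.
    Assumption: the row of each nonzero is drawn uniformly at random."""
    R = np.zeros((m, n), dtype=np.int8)
    rows = rng.integers(0, m, size=n)  # row index of the single 1 in each column
    R[rows, np.arange(n)] = 1
    return R

def binary_quantize(y, tau):
    """{0,1} codes: 1 where y > tau, else 0 (tau is an assumed threshold)."""
    return (y > tau).astype(np.int8)

def ternary_quantize(y, tau):
    """{0,+1,-1} codes: sign(y) where |y| > tau, else 0 (tau is assumed)."""
    return (np.sign(y) * (np.abs(y) > tau)).astype(np.int8)

# Usage: project a random vector from n = 64 to m = 8 dimensions, then quantize.
rng = np.random.default_rng(0)
R = sparse_binary_matrix(8, 64, rng)
x = rng.standard_normal(64)
y = R @ x                               # each output sums a disjoint subset of x
b = binary_quantize(y, tau=0.0)
t = ternary_quantize(y, tau=np.abs(y).mean())
```

With one nonzero per column, computing `R @ x` needs only additions, one per input coordinate, which is the source of the computational and storage savings the abstract refers to.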