Abstract: This paper presents Triply Compressed Sparse Column (TCSC), a novel compression technique designed specifically for matrix-vector operations in which the matrix, the input vector, and the output vector are all sparse. We refer to these operations as SpMSpV2. TCSC compresses the nonzero columns and rows of a highly sparse matrix representing a large real-world graph. During this compression, it encodes the sparsity patterns of the input and output vectors within the compressed representation of the sparse matrix itself. Consequently, the compressed indices of the input and output vectors align with those of the compressed matrix columns and rows, which eliminates the extra indirections otherwise needed when SpMSpV2 operations access the vectors. This results in fewer cache misses, greater space efficiency, and faster execution times. We evaluate TCSC's performance and show that it is more space- and time-efficient than Compressed Sparse Column (CSC) and Doubly Compressed Sparse Column (DCSC), with up to 11× speedup. We integrate TCSC into GraphTap, our proposed linear algebra-based distributed graph analytics system. We compare GraphTap against GraphPad and LA3, two state-of-the-art linear algebra-based distributed graph analytics systems, using different dataset scales and numbers of processes. GraphTap is up to 7× faster than these systems due to TCSC and the resulting communication efficiency.
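The index-alignment idea described in the abstract can be illustrated with a small sketch. This is not the authors' TCSC implementation: it is a simplified, single-process approximation that assumes the matrix arrives as (row, column, value) triples, and the names `build_compressed` and `spmspv` are hypothetical. It keeps only nonzero columns and rows, stores row indices in compressed form, and sizes the input and output vectors to the compressed column and row counts, so the multiply loop reads and writes the vectors directly without a lookup table.

```python
def build_compressed(triples):
    """Build a column-oriented layout over nonzero rows and columns only.

    Hypothetical helper for illustration; not the paper's data structure.
    `triples` is an iterable of (row, col, value) entries.
    """
    triples = list(triples)
    cols = sorted({c for _, c, _ in triples})   # original ids of nonzero columns
    rows = sorted({r for r, _, _ in triples})   # original ids of nonzero rows
    col_of = {c: j for j, c in enumerate(cols)}  # original -> compressed column
    row_of = {r: i for i, r in enumerate(rows)}  # original -> compressed row

    # Group entries by compressed column, storing compressed row indices.
    buckets = [[] for _ in cols]
    for r, c, v in triples:
        buckets[col_of[c]].append((row_of[r], v))

    colptr, rowidx, vals = [0], [], []
    for bucket in buckets:
        for i, v in sorted(bucket):
            rowidx.append(i)
            vals.append(v)
        colptr.append(len(rowidx))

    return {"cols": cols, "rows": rows,
            "colptr": colptr, "rowidx": rowidx, "vals": vals}


def spmspv(m, x):
    """Compute y = A x with x indexed by compressed column, y by compressed row.

    Because the vectors share the matrix's compressed index spaces,
    x[j] and y[rowidx[k]] are accessed directly, with no index translation.
    """
    y = [0.0] * len(m["rows"])
    for j, xj in enumerate(x):
        if xj == 0:
            continue  # skip zero entries of the sparse input vector
        for k in range(m["colptr"][j], m["colptr"][j + 1]):
            y[m["rowidx"][k]] += m["vals"][k] * xj
    return y


# Three nonzeros in what would be a much larger, mostly empty matrix.
m = build_compressed([(0, 2, 1.0), (5, 2, 2.0), (5, 7, 3.0)])
y = spmspv(m, [10.0, 0.0])  # x has one slot per nonzero column (2 and 7)
```

In this sketch, `m["rows"]` and `m["cols"]` are needed only to translate results back to original vertex ids at the end; the inner loop never consults them, which is the property the abstract credits for the reduced indirection.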
External IDs: dblp:conf/cluster/Hasanzadeh-Mofrad19