Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP

Qingxiao Sun, Yi Liu, Hailong Yang, Ming Dun, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian

Published: 01 Jan 2022, Last Modified: 10 May 2023, IEEE Trans. Computers 2022, Readers: Everyone

Abstract: Canonical polyadic decomposition (CPD) is one of the most common tensor computations adopted in many scientific applications. The major bottleneck of CPD is the matricized tensor times Khatri-Rao product (MTTKRP). To optimize the performance of MTTKRP, various sparse tensor formats have been proposed, such as CSF and HiCOO. However, due to the spatial complexity of the tensors, no single format fits all tensors. To address this problem, we propose SpTFS, a framework that automatically predicts the optimal storage format for an input sparse tensor. Specifically, SpTFS leverages a set of sampling methods to lower the sparse tensor to fixed-sized matrices and sparsity features. In addition, SpTFS adopts both supervised and unsupervised learning methods to predict the optimal sparse tensor storage format. For supervised learning, we propose TnsNet, which combines a convolutional neural network (CNN) with a feature layer and effectively captures the sparsity patterns of the input tensors. Once trained, TnsNet can be used with either the density or the histogram representation of the input tensor for optimal format prediction. For unsupervised learning, we propose TnsClustering, which consists of a feature encoder built from convolutional and fully connected layers and a K-means++ model that clusters sparse tensors for optimal format prediction without extensive profiling on the hardware platform. SpTFS can use either of these two models to accurately predict the optimal tensor storage format for accelerating MTTKRP. The experimental results show that both TnsNet and TnsClustering achieve higher prediction accuracy and performance speedup than state-of-the-art approaches.
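
To make the two central notions in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 3-way sparse tensor stored as plain coordinate (COO) lists. The function names `mttkrp_mode0` and `density_map`, the choice of projection modes, and the grid size are all hypothetical. The first function computes a mode-0 MTTKRP directly from the nonzeros. The second shows one simple way a tensor of arbitrary shape could be lowered to a fixed-size density matrix. The abstract does not specify the sampling methods SpTFS actually uses, so the binning scheme here is an assumption.

```python
def mttkrp_mode0(coords, vals, B, C, num_rows):
    """Mode-0 MTTKRP for a 3-way sparse tensor in COO form.

    Computes M[i][r] = sum over nonzeros (i, j, k) of X[i,j,k] * B[j][r] * C[k][r],
    where B and C are dense factor matrices given as lists of rows.
    """
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), x in zip(coords, vals):
        for r in range(rank):
            M[i][r] += x * B[j][r] * C[k][r]
    return M


def density_map(coords, shape, mode_a=0, mode_b=1, size=2):
    """Lower a sparse tensor to a fixed size x size matrix (assumed scheme).

    Nonzeros are projected onto the (mode_a, mode_b) plane and binned into a
    uniform grid; each cell holds the fraction of nonzeros that fall into it.
    """
    grid = [[0.0] * size for _ in range(size)]
    for c in coords:
        row = c[mode_a] * size // shape[mode_a]
        col = c[mode_b] * size // shape[mode_b]
        grid[row][col] += 1.0
    nnz = len(coords)
    if nnz == 0:
        return grid
    return [[v / nnz for v in row] for row in grid]


# Hypothetical 4 x 3 x 2 tensor with four nonzeros.
coords = [(0, 0, 0), (0, 1, 1), (1, 1, 0), (3, 2, 1)]
vals = [1.0, 2.0, 3.0, 4.0]
B = [[1, 2], [3, 4], [5, 6]]   # 3 x 2 factor for mode 1
C = [[1, 0], [0, 1]]           # 2 x 2 factor for mode 2

M = mttkrp_mode0(coords, vals, B, C, num_rows=4)
D = density_map(coords, shape=(4, 3, 2), size=2)
```

The output of `density_map` has the same dimensions regardless of the tensor's shape, which is the property a CNN-based predictor needs from its input. The COO loop in `mttkrp_mode0` is the baseline access pattern that formats such as CSF and HiCOO reorganize.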
