Abstract: Tensor provides a brief and natural representation for large-scale multidimensional data by way of appropriate low-rank approximations, thus we can discover significant latent structures of complex data and generalize data representation. To date, tensor has gained tremendous success in various science and technology fields, especially in machine learning and big data applications. However, tensor computation, especially tensor decomposition, is usually expensive due to the inherent large-size characteristic of tensors, and hence would potentially hinder their future wide deployment. In this paper, we develop a hardware architecture to accelerate tensor singular value decomposition (t-SVD), which is a new tensor decomposition technique that has been successfully applied to high-dimensional data classification and video recovery. Specifically, design consideration of each key computing unit is analyzed and discussed. Then, the proposed t-SVD hardware architecture is implemented and synthesized using CMOS 28nm technology. Comparison with real-world CPU-based implementations shows that the proposed hardware accelerator is expected to provide average 14× speedup on various t-SVD workloads.
0 Replies
Loading