Accelerating Matrix Multiplication in Deep Learning by Using Low-Rank Approximation

HPCS 2017 (modified: 01 Nov 2022)
Abstract: Open-source deep learning frameworks such as TensorFlow, Caffe, and Torch are widely used around the world, so accelerating them is of great value. In these frameworks, much of the computation time is spent on convolution, and highly tuned libraries such as cuDNN play an important role in accelerating it. These libraries, however, perform the convolution computation without approximating the dense matrices involved. In this research, we propose a method that introduces low-rank approximation, widely used in scientific and technical computing, into the convolution computation. Investigating its influence on the recognition accuracy of an existing model, we find that the rank of the data matrices can be reduced by up to about 90% while keeping recognition accuracy within 2% of the baseline.
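The abstract does not say how the low-rank approximation is computed, so the following is only a minimal sketch of the general idea, assuming a truncated SVD applied to a data matrix before a matrix product. The names `low_rank_factors` and `low_rank_matmul`, the matrix shapes, and the synthetic rank-8 data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def low_rank_factors(a, rank):
    """Return thin factors (left, right) whose product is the best
    rank-`rank` approximation of `a` in the Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    left = u[:, :rank] * s[:rank]  # shape (m, rank), singular values folded in
    right = vt[:rank, :]           # shape (rank, k)
    return left, right


def low_rank_matmul(a, b, rank):
    """Approximate a @ b by multiplying through the thin factors of `a`."""
    left, right = low_rank_factors(a, rank)
    # Evaluating right-to-left costs about rank * n * (m + k) multiply-adds
    # instead of m * k * n for the dense product (SVD cost not included).
    return left @ (right @ b)


rng = np.random.default_rng(0)

# Assumption for illustration: a 64 x 128 data matrix of exact rank 8,
# multiplied by a 128 x 32 weight matrix.
m, k, n, true_rank = 64, 128, 32, 8
data = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, k))
weights = rng.standard_normal((k, n))

exact = data @ weights
approx = low_rank_matmul(data, weights, rank=true_rank)

# Relative error of the approximate product against the dense product.
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)

# Truncating below the true rank discards information and raises the error.
approx_half = low_rank_matmul(data, weights, rank=true_rank // 2)
rel_err_half = np.linalg.norm(exact - approx_half) / np.linalg.norm(exact)
```

In this sketch the saving only appears when the chosen rank is much smaller than the matrix dimensions, and the cost of computing the factorization itself is not counted. How the factorization is obtained and amortized in practice is left open here.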