Optimal Selection of Matrix Shape and Decomposition Scheme for Neural Network Compression
Abstract: When applying low-rank decomposition to neural networks, tensor-shaped weights must first be reshaped into a matrix. While many matrix reshapes are possible, some of them induce a low-rank decomposition scheme that can be more efficiently implemented as a sequence of layers. This poses the following problem: how should one select both the matrix reshape and the associated low-rank decomposition scheme so that the compressed network's implementation is as efficient as possible? We formulate this problem as a mixed-integer optimization over the weights, ranks, and decomposition schemes, and we provide an efficient alternating optimization algorithm involving two simple steps: a step over the weights of the neural network (solved by SGD), and a step over the ranks and decomposition schemes (solved by an SVD). Our algorithm automatically selects the most suitable ranks and decomposition schemes to efficiently reduce the compression cost (e.g., FLOPs) of various networks.
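The abstract's SVD step can be illustrated on a single weight matrix. The sketch below is a hypothetical simplification (the function name `svd_rank_step` and the error-tolerance criterion are assumptions, not the paper's exact selection rule): it truncates the SVD of `W` to the smallest rank whose relative reconstruction error stays below a tolerance, yielding two factors that implement the layer as a cheaper sequence of two matrix multiplies.

```python
import numpy as np

def svd_rank_step(W, err_tol=0.1):
    """Sketch of an SVD-based rank-selection step (assumed criterion):
    truncate the SVD of W to the smallest rank r whose relative
    reconstruction error is at most err_tol, and return the two
    low-rank factors A (m x r) and B (r x n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    total = np.sum(s ** 2)
    for r in range(1, len(s) + 1):
        # relative Frobenius error of the rank-r truncation
        err = np.sqrt(np.sum(s[r:] ** 2) / total)
        if err <= err_tol:
            break
    A = U[:, :r] * s[:r]   # (m, r) factor, singular values folded in
    B = Vt[:r, :]          # (r, n) factor
    return A, B, r

# Toy example: a matrix of exact rank 2 should compress well.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 32))
A, B, r = svd_rank_step(W, err_tol=0.05)

# FLOP count of a dense layer vs. the factored sequence of two layers
flops_orig = 2 * W.shape[0] * W.shape[1]
flops_low = 2 * r * (W.shape[0] + W.shape[1])
```

Replacing one dense layer `x @ W` by the sequence `(x @ A) @ B` is exactly the "sequence of layers" implementation the abstract refers to; it pays off whenever `r * (m + n) < m * n`.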