Compressing Deep Neural Networks With Learnable Regularization

25 Sept 2019 (modified: 05 May 2023) · ICLR 2020 Conference Withdrawn Submission · Readers: Everyone
Abstract: We consider learning and compressing deep neural networks (DNNs) with low-precision weights and activations for efficient fixed-point inference. When training low-precision DNNs, quantized low-precision weights and activations are used in the forward pass to compute the loss, while gradient descent in the backward pass updates high-precision weights. This mismatch between the forward and backward passes makes gradient descent suboptimal and leads to accuracy loss. To reduce the mismatch, we utilize mean squared quantization error (MSQE) regularization. In particular, we propose a learnable regularization coefficient for the MSQE regularizer that reinforces the convergence of high-precision weights to their quantized values. Furthermore, we investigate how partial L2 regularization can be employed for weight pruning in a similar manner. Finally, combining weight pruning, quantization, and entropy coding, we establish a complete low-precision DNN compression pipeline. In our experiments, the proposed method produces low-precision MobileNet and ShuffleNet models for ImageNet classification with state-of-the-art compression ratios. Moreover, we apply our method to image super-resolution DNNs and obtain low-precision models with negligible performance loss.
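
As a rough illustration of the learnable-coefficient idea described in the abstract, the sketch below adds an MSQE penalty on the network weights with a jointly trained coefficient. The uniform quantizer, the softplus parameterization of the coefficient alpha, and the -log(alpha) term that keeps alpha from collapsing are all assumptions made for this example; they are not taken from the paper, whose exact formulation may differ.

```python
# A minimal sketch (not the authors' code): MSQE regularization with a
# learnable coefficient, assuming a uniform weight quantizer and a
# softplus-parameterized coefficient alpha.
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_quantize(w, num_bits=4):
    """Uniformly quantize w to 2**num_bits levels over its observed range."""
    qmin, qmax = w.min().detach(), w.max().detach()
    scale = (qmax - qmin).clamp_min(1e-8) / (2 ** num_bits - 1)
    return torch.round((w - qmin) / scale) * scale + qmin

class MSQERegularizer(nn.Module):
    """Mean squared quantization error penalty with a learnable coefficient.

    alpha = softplus(rho) is trained jointly with the network weights; the
    -log(alpha) term (an assumed choice) keeps alpha positive and encourages
    it to grow, pulling high-precision weights toward their quantized values.
    """
    def __init__(self, num_bits=4):
        super().__init__()
        self.num_bits = num_bits
        self.rho = nn.Parameter(torch.zeros(()))  # unconstrained parameter

    def forward(self, weights):
        alpha = F.softplus(self.rho)
        msqe = sum(((w - uniform_quantize(w, self.num_bits)) ** 2).mean()
                   for w in weights)
        return alpha * msqe - torch.log(alpha)

# Usage (hypothetical): total_loss = task_loss + msqe_reg(model.parameters())
```

In such a setup the regularizer's own parameter (rho) would also have to be passed to the optimizer so that the coefficient is learned alongside the weights.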