Abstract: Deep neural networks have demonstrated their superiority in many fields. Their excellent performance relies on a large number of parameters, which leads to a series of problems, including high memory and computation requirements and overfitting, that seriously impede the practical application of deep neural networks. Many model compression methods have been proposed to reduce the number of parameters in networks, among which one family of methods pursues sparsity in deep neural networks. In this paper, we propose to combine the ℓ1,1 and ℓ1,2 norms as the regularization term of the network's objective function. We introduce a group structure over the weights: the ℓ1,1 regularizer can zero out weights at both the inter-group and intra-group levels, while the ℓ1,2 regularizer yields intra-group sparsity and encourages even weight magnitudes across groups. We adopt proximal gradient descent to optimize the objective function with the combined regularizer. Experimental results demonstrate the effectiveness of the proposed regularizer compared with other baseline regularizers.
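The abstract does not spell out the exact form of the combined penalty; the sketch below is only a rough illustration under common conventions, assuming the groups are the columns of a weight matrix, the two norms are combined as a weighted sum, and the coefficients lam1 and lam2 are hypothetical hyperparameters not taken from the paper.

```python
import numpy as np

def l11_penalty(W):
    """l_{1,1} norm: sum of absolute values of all weights (element-wise sparsity)."""
    return np.abs(W).sum()

def l12_penalty(W):
    """l_{1,2} norm: l1 norm inside each group (here, each column of W),
    followed by an l2 norm across groups, which evens out group magnitudes."""
    group_l1 = np.abs(W).sum(axis=0)       # per-group l1 norms
    return np.sqrt(np.sum(group_l1 ** 2))  # l2 norm over the group-wise l1 values

def combined_penalty(W, lam1, lam2):
    """Weighted combination of the two penalties (lam1, lam2 are assumed weights)."""
    return lam1 * l11_penalty(W) + lam2 * l12_penalty(W)
```

In a proximal gradient scheme of the kind the abstract mentions, each iteration would take a gradient step on the data loss and then apply the proximal operator of this combined penalty; the closed form of that operator is not given in the abstract, so the sketch above only shows how the penalty itself could be evaluated.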