On the Efficiency of Deep Neural Networks

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: Deep learning, Neural networks, Computation Efficiency, Weight pruning, Overfitting, Softmax, Log likelihood ratio (LLR)
Abstract: The efficiency of neural networks is very important in large-scale deployment scenarios such as mobile applications, the internet of things, and edge computing. For a given performance requirement, an efficient neural network should use the simplest network architecture with a minimal number of parameters and connections. In this paper, we discuss several key issues in, and a new procedure for, obtaining efficient networks that minimize the total number of parameters and the computation requirement. Our first contribution is identifying and analyzing several key components in training efficient networks with the backpropagation (BP) algorithm: 1) softmax normalization in output layers may be a major cause of parameter explosion; 2) using a log likelihood ratio (LLR) representation in output layers can reduce overfitting; 3) weight decay and structural regularization can effectively reduce overfitting when ReLU activation is used. The second contribution is the discovery that a well-trained network without overfitting can be effectively pruned using a simple snapshot-based procedure: after pruning unimportant weights and connections, simply adjust the remaining non-weight parameters using the BP algorithm. The snapshot-based pruning method can also be used to evaluate and analyze the efficiency of neural networks. Finally, we hypothesize that, for a given optimization problem, there exist lower bounds on the total number of bits needed to represent parameters and connections with respect to a performance metric. Rather than focusing solely on improving the accuracy metric with more complex network architectures, we should also explore the trade-offs between accuracy and the total number of representation bits when comparing different network architectures and implementations.
One-sentence Summary: Efficient deep neural networks can be obtained by reducing overfitting during training and applying a simple snapshot-based pruning procedure.
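The snapshot-based pruning procedure is only outlined in the abstract, so the following is a minimal sketch of one plausible reading, not the authors' implementation: magnitude-based pruning of the weight tensors followed by a brief backpropagation fine-tuning pass over the remaining non-weight parameters (biases and normalization scales/shifts). The PyTorch code, the function name snapshot_prune_and_tune, the global-quantile threshold, and the prune_fraction/train_loader arguments are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def snapshot_prune_and_tune(model: nn.Module, train_loader,
                            prune_fraction: float = 0.9,
                            epochs: int = 1, lr: float = 1e-3,
                            device: str = "cpu"):
    """Illustrative snapshot-based pruning sketch (assumed reading of the abstract).

    1) Zero out the smallest-magnitude weights in every Linear/Conv2d layer.
    2) Freeze the pruned weight tensors.
    3) Fine-tune only the remaining non-weight parameters (biases, norm
       scales/shifts) with ordinary backpropagation.
    """
    model.to(device)

    # Step 1: magnitude-based pruning of weight tensors (assumed criterion).
    weight_tensors = [m.weight for m in model.modules()
                      if isinstance(m, (nn.Linear, nn.Conv2d))]
    magnitudes = torch.cat([w.detach().abs().flatten() for w in weight_tensors])
    threshold = torch.quantile(magnitudes, prune_fraction)
    with torch.no_grad():
        for w in weight_tensors:
            w.mul_((w.abs() > threshold).float())  # zero "unimportant" weights

    # Step 2: freeze multi-dimensional weight tensors; keep biases and
    # 1-D normalization "weight" vectors trainable.
    for name, p in model.named_parameters():
        p.requires_grad = (not name.endswith("weight")) or p.dim() == 1

    # Step 3: brief BP adjustment of the remaining non-weight parameters.
    tunable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(tunable, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
    return model
```

Under this reading, the "snapshot" is simply the trained model as-is: no iterative prune-retrain cycles are performed, only a single pruning pass and a short BP adjustment of the surviving non-weight parameters.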