DNN Model Compression Under Accuracy Constraints

Soroosh Khoram, Jing Li

Feb 15, 2018 (modified: Feb 15, 2018) ICLR 2018 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: The growing interest to implement Deep Neural Networks (DNNs) on resource-bound hardware has motivated innovation of compression algorithms. Using these algorithms, DNN model sizes can be substantially reduced, with little to no accuracy degradation. This is achieved by either eliminating components from the model, or penalizing complexity during training. While both approaches demonstrate considerable compressions, the former often ignores the loss function during compression while the later produces unpredictable compressions. In this paper, we propose a technique that directly minimizes both the model complexity and the changes in the loss function. In this technique, we formulate compression as a constrained optimization problem, and then present a solution for it. We will show that using this technique, we can achieve competitive results.
  • TL;DR: Compressing trained DNN models by minimizing their complexity while constraining their loss.
  • Keywords: DNN Compression, Weigh-sharing, Model Compression