Training Deep Neural Networks with Joint Quantization and Pruning of Features and Weights

29 Sept 2021 (modified: 13 Feb 2023) — ICLR 2022 Conference Withdrawn Submission
Keywords: Deep Learning, Quantization, Pruning, Feature Sparsity, Model Compression, Unstructured Feature Pruning
Abstract: Quantization and pruning are widely used to reduce the inference costs of deep neural networks. In this work, we propose a framework for training deep neural networks with novel methods for uniform quantization and unstructured pruning applied to both features and weights. We demonstrate that our method delivers higher performance per memory footprint than existing state-of-the-art solutions. Using our framework, we empirically evaluate the prune-then-quantize paradigm and the independence assumption across a wide range of computer vision tasks, and we observe that quantization and pruning do not commute when applied to both features and weights.
One-sentence Summary: We propose a framework for training deep neural networks with novel methods for uniform quantization and unstructured pruning applied to both features and weights.
Supplementary Material: zip
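
To make the non-commutativity claim concrete, below is a minimal sketch, not the paper's actual method, contrasting prune-then-quantize with quantize-then-prune on a single weight tensor. It assumes per-tensor symmetric uniform quantization and magnitude-based unstructured pruning in PyTorch; the function names, bit width, and sparsity level are illustrative choices.

import torch

def uniform_quantize(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    # Symmetric per-tensor uniform quantization (illustrative, not the paper's scheme):
    # map x onto integer levels in [-qmax, qmax], then dequantize back to float.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max() / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q * scale

def unstructured_prune(x: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    # Magnitude-based unstructured pruning: zero out the smallest |x| entries
    # so that roughly `sparsity` fraction of the tensor becomes zero.
    threshold = torch.quantile(x.abs().flatten(), sparsity)
    return x * (x.abs() > threshold)

# Compare the two orderings on the same random weight tensor.
torch.manual_seed(0)
w = torch.randn(256, 256)
prune_then_quantize = uniform_quantize(unstructured_prune(w, 0.5), num_bits=4)
quantize_then_prune = unstructured_prune(uniform_quantize(w, num_bits=4), 0.5)

# The two results generally differ, i.e. the operations do not commute.
print("max |difference|:", (prune_then_quantize - quantize_then_prune).abs().max().item())

The gap arises because pruning changes the tensor's magnitude statistics (and hence the quantization scale), while quantization perturbs the magnitudes that pruning ranks; the same interaction applies when the operations are trained jointly on features as well as weights.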