Variational Network Quantization
Nov 03, 2017 (modified: Nov 03, 2017) · ICLR 2018 Conference Blind Submission
Abstract: We formulate the preparation of a neural network for pruning and few-bit quantization as a variational inference problem. We introduce a quantizing prior that leads to a multi-modal, sparse posterior distribution over weights and further derive a differentiable KL approximation for this prior. After training with Variational Network Quantization (VNQ), weights can be replaced by deterministic quantization values with small to negligible loss of task accuracy (including pruning by setting weights to 0). Our method does not require fine-tuning after quantization. We show results for ternary quantization on LeNet-5 (MNIST) and DenseNet-121 (CIFAR-10).
TL;DR: We quantize and prune neural network weights using variational Bayesian inference with a multi-modal, sparsity-inducing prior.
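The final step described in the abstract, replacing trained weights with deterministic quantization values, can be sketched as a nearest-codebook assignment. This is an illustrative sketch, not the authors' implementation: the codebook values below are placeholders, whereas in VNQ the quantization levels are tied to the learned prior, and the weights being snapped would be the posterior means after variational training.

```python
import numpy as np

def quantize_ternary(weights, codebook=(-0.5, 0.0, 0.5)):
    """Snap each weight to its nearest codebook value.

    After VNQ training the posterior means cluster tightly around the
    quantization levels, so this hard assignment costs little accuracy.
    Weights assigned to the 0 level are effectively pruned.
    The codebook here is a hypothetical example; in the paper the
    levels emerge from the quantizing prior.
    """
    codebook = np.asarray(codebook)
    # index of the nearest codebook entry for every weight
    idx = np.abs(weights[..., None] - codebook).argmin(axis=-1)
    return codebook[idx]

w = np.array([0.48, -0.02, -0.51, 0.1])
print(quantize_ternary(w))  # -> [ 0.5  0.  -0.5  0. ]
```

Because the multi-modal posterior already concentrates weights near the codebook, this snap-to-nearest step is why no fine-tuning is needed after quantization.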