Weight fixing networks

Christopher Subia-Waud, Srinandan Dasmahapatra

Published: 03 Nov 2022, Last Modified: 05 Nov 2025. Venue: Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings. License: CC BY-SA 4.0
Abstract: Modern iterations of deep learning models contain millions (billions) of unique parameters, each represented by a b-bit number. Popular attempts at compressing neural networks (such as pruning and quantisation) have shown that many of the parameters are superfluous and can be removed (pruning) or expressed with b′ < b bits (quantisation) without hindering performance. Here we look to go much further in minimising the information content of networks. Rather than a channel or layer-wise encoding, we look to lossless whole-network quantisation to minimise the entropy and number of unique parameters in a network. We propose a new method, which we call Weight Fixing Networks (WFN), that we design to realise four model outcome objectives: i) very few unique weights, ii) low-entropy weight encodings, iii) unique weight values which are amenable to energy-saving versions of hardware multiplication, and iv) lossless task-performance. Some of these goals are conflicting. To best balance these conflicts, we combine a few novel (and some well-trodden) tricks: a novel regularisation term (i, ii), a view of clustering cost as relative distance change (i, ii, iv), and a focus on whole-network re-use of weights (i, iii). Our ImageNet experiments demonstrate lossless compression using 56x fewer unique weights and a 1.9x lower weight-space entropy than SOTA quantisation approaches. Code and model saves can be found at github.com/subiawaud/Weight Fix Networks.
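To make objectives (i) and (ii) concrete, the following is a minimal sketch of how the two headline quantities could be measured. It assumes that "weight-space entropy" means the Shannon entropy, in bits, of the empirical distribution of weight values pooled across the whole network; that reading, the function name `unique_count_and_entropy`, and the toy weights are illustrative assumptions and are not taken from the authors' code.

```python
import math
from collections import Counter


def unique_count_and_entropy(weights):
    """Return (number of unique values, Shannon entropy in bits).

    `weights` is a flat list holding every weight in the network,
    i.e. all layers pooled together (whole-network view).
    Assumption: entropy is taken over the empirical value frequencies.
    """
    counts = Counter(weights)
    total = len(weights)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return len(counts), entropy


# Hypothetical toy network: eight weights that share only three distinct values.
weights = [0.5, 0.5, -0.25, 0.5, 0.0, 0.0, -0.25, 0.5]
n_unique, entropy_bits = unique_count_and_entropy(weights)
print(n_unique, entropy_bits)
```

Under this reading, fewer distinct values and a more skewed re-use of them both lower the entropy, which is why the two objectives tend to be pursued together.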