Partial Binarization of Neural Networks for Budget-Aware Efficient Learning
Abstract: Binarization is a powerful compression technique for
neural networks that can dramatically reduce FLOPs, but it often
causes a significant drop in model performance. To address this issue, partial binarization techniques have been
developed, but a systematic approach to mixing binary and
full-precision parameters in a single network is still lacking. In this paper, we propose a controlled approach to
partial binarization, creating a budgeted binary neural network (B2NN) with our MixBin strategy. This method optimizes the mixing of binary and full-precision components,
allowing for explicit selection of the fraction of the network
to remain binary. Our experiments show that B2NNs created using MixBin outperform those from random or iterative searches and state-of-the-art layer selection methods
by up to 3% on the ImageNet-1K dataset. We also show that
B2NNs outperform the structured pruning baseline by approximately 23% at the extreme FLOP budget of 15%, and
perform well in object tracking, with up to a 12.4% relative
improvement over other baselines. Additionally, we demonstrate that B2NNs developed by MixBin can be transferred
across datasets, with some cases showing improved performance over directly applying MixBin on the downstream
data.
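
To make the idea of a budgeted binary network concrete, the following is a minimal PyTorch sketch of partial binarization: a chosen fraction of convolution layers is swapped for sign-binarized counterparts trained with a straight-through estimator. The front-to-back layer selection and the names `BinaryConv2d` and `binarize_fraction` are illustrative assumptions only; MixBin itself optimizes which components remain binary, as described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryConv2d(nn.Conv2d):
    """Conv2d whose weights are binarized with sign() in the forward pass.

    A straight-through estimator lets gradients flow to the latent
    full-precision weights during training.
    """
    def forward(self, x):
        w = self.weight
        # Forward uses sign(w); backward treats binarization as identity.
        w_bin = w + (torch.sign(w) - w).detach()
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


def binarize_fraction(model: nn.Module, fraction: float) -> nn.Module:
    """Replace the first `fraction` of Conv2d layers with binary versions.

    NOTE: front-to-back selection is a placeholder for illustration;
    MixBin optimizes the binary/full-precision mix under the budget.
    """
    convs = [(n, m) for n, m in model.named_modules()
             if n and isinstance(m, nn.Conv2d)
             and not isinstance(m, BinaryConv2d)]
    for name, conv in convs[:int(len(convs) * fraction)]:
        binary = BinaryConv2d(
            conv.in_channels, conv.out_channels, conv.kernel_size,
            stride=conv.stride, padding=conv.padding,
            dilation=conv.dilation, groups=conv.groups,
            bias=conv.bias is not None)
        binary.load_state_dict(conv.state_dict())
        # Walk to the parent module and swap in the binary layer.
        parent = model
        *path, leaf = name.split(".")
        for part in path:
            parent = getattr(parent, part)
        setattr(parent, leaf, binary)
    return model


# Example usage (requires torchvision): keep 50% of convs binary.
# import torchvision
# b2nn = binarize_fraction(torchvision.models.resnet18(), fraction=0.5)
```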