S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

Xinlin Li; Bang Liu; Yaoliang Yu; Wulong Liu; Chunjing Xu; Vahid Partovi Nia

Published: 09 Nov 2021, Last Modified: 05 May 2023, NeurIPS 2021 Poster, Readers: Everyone

Keywords: Convolutional Neural Network, CNN, Edge Computing, Edge, Quantization, Low-bit, Multiplication-free, Bit-shift

TL;DR: S$^3$ re-parameterization enables efficient training of low-bit shift networks by combating the vanishing gradient and weight sign freezing problems; a 3-bit multiplication-free network competes with the full-precision model on ImageNet.

Abstract: Shift neural networks reduce computational complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, which makes them fast and energy-efficient compared to conventional neural networks. However, existing shift networks are sensitive to weight initialization and suffer degraded performance caused by the vanishing gradient and weight sign freezing problems. To address these issues, we propose S$^3$ re-parameterization, a novel technique for training low-bit shift networks. Our method decomposes each discrete parameter in a sign-sparse-shift 3-fold manner. This way, it efficiently learns a low-bit network whose weight dynamics are similar to those of full-precision networks and that is insensitive to weight initialization. Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks compete with their full-precision counterparts in terms of top-1 accuracy on ImageNet.
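To make the sign-sparse-shift idea concrete, here is a minimal sketch of how a discrete weight might be composed from three kinds of binary factors, and how multiplying by such a weight reduces to a bit shift. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names (`s3_weight`, `shift_multiply`, `enumerate_weights`), the encoding of the sign as a 0/1 bit, and the choice of summing two binary shift variables to get the exponent. It also omits training entirely (how the binary factors would be learned from continuous parameters).

```python
from itertools import product


def s3_weight(sign_bit, sparse_bit, shift_bits):
    """Compose a discrete weight from binary factors (illustrative assumption).

    sign_bit:   0 -> negative, 1 -> positive
    sparse_bit: 0 -> weight is zero, 1 -> weight is non-zero
    shift_bits: tuple of 0/1 values; their sum is the power-of-two exponent
    """
    sign = 2 * sign_bit - 1      # map {0, 1} to {-1, +1}
    exponent = sum(shift_bits)   # number of positions to shift
    return sparse_bit * sign * (1 << exponent)


def shift_multiply(x, sign_bit, sparse_bit, shift_bits):
    """Multiply integer x by the composed weight without a multiply op."""
    if sparse_bit == 0:
        return 0                 # sparse factor zeroes the product
    shifted = x << sum(shift_bits)
    return shifted if sign_bit == 1 else -shifted


def enumerate_weights(num_shift_bits=2):
    """List every distinct weight value reachable with the given factors."""
    values = set()
    for sign_bit, sparse_bit in product((0, 1), repeat=2):
        for shift_bits in product((0, 1), repeat=num_shift_bits):
            values.add(s3_weight(sign_bit, sparse_bit, shift_bits))
    return sorted(values)


if __name__ == "__main__":
    print(enumerate_weights())
    print(shift_multiply(5, 0, 1, (1, 1)))
```

Under these assumptions, one sign factor, one sparse factor, and two shift factors yield seven distinct values (zero plus positive and negative powers of two up to 4), which fit in 3 bits. The shift-based product agrees with ordinary multiplication by the composed weight.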

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf
