Neural network compression using binarization and few full-precision weights

Published: 01 Jan 2025, Last Modified: 19 May 2025, Inf. Sci. 2025, CC BY-SA 4.0
Abstract: Highlights
• A network compression method (APB) that combines pruning and quantization.
• Two matrix multiplication algorithms for extremely low-bit operands.
• APB reduces the network size by one order of magnitude with no accuracy loss.
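To give a concrete sense of what multiplying extremely low-bit operands can look like, the sketch below shows the widely used XNOR-and-popcount trick for 1-bit (±1) operands. It is not taken from the paper and should not be read as either of its two algorithms; the function names `pack_signs`, `binary_dot`, and `binary_matmul` are hypothetical, and the bit layout (bit i set when element i is +1) is an assumption made for illustration.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] is +1."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given in packed form.

    Positions where the bits agree contribute +1, positions where they
    differ contribute -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR restricted to the n valid bits
    matches = bin(agree).count("1")    # popcount
    return 2 * matches - n


def binary_matmul(rows, cols):
    """Multiply a matrix given as rows by a matrix given as columns (all entries +1/-1)."""
    n = len(rows[0])
    packed_rows = [pack_signs(r) for r in rows]
    packed_cols = [pack_signs(c) for c in cols]
    return [[binary_dot(r, c, n) for c in packed_cols] for r in packed_rows]


if __name__ == "__main__":
    rows = [[1, -1, 1, 1], [-1, -1, -1, -1]]
    cols = [[1, 1, -1, 1], [1, 1, 1, 1]]
    print(binary_matmul(rows, cols))
```

Packing the operands once and reusing the packed words is what makes this attractive in practice: each inner product becomes one XOR, one mask, and one popcount per machine word instead of n multiply-adds.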