Abstract: Highlights
• A network compression method (APB) that combines pruning and quantization.
• Two matrix multiplication algorithms for extremely low-bit operands.
• APB reduces the network size by one order of magnitude with no accuracy loss.