Efficient Computation of Quantized Neural Networks by {−1, +1} Encoding Decomposition

08 Oct 2018 (modified: 05 May 2023) · NIPS 2018 Workshop CDNNRIA Blind Submission
Abstract: Deep neural networks require extensive computing resources and cannot be efficiently deployed on embedded devices such as mobile phones, which seriously limits their applicability. To address this problem, we propose a novel encoding scheme that uses {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks, which can be efficiently implemented with bitwise operations (xnor and bitcount) to achieve model compression, computational acceleration, and resource saving. Our method achieves up to ~59× speedup and ~32× memory saving over its full-precision counterparts. Moreover, users can freely choose the encoding precision according to their requirements and hardware resources. Our mechanism is well suited to FPGAs and ASICs in terms of data storage and computation, offering a practical design approach for smart chips. We validate the effectiveness of our method on both large-scale image classification (e.g., ImageNet) and object detection tasks.
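To make the mechanism concrete, below is a minimal sketch (not the authors' code; the function names binary_dot, encode_branches, and quantized_dot are illustrative assumptions, as is the particular signed-digit decomposition v = Σᵢ 2ⁱ·bᵢ with bᵢ ∈ {-1, +1}). It shows how a K-bit quantized value decomposes into K binary {-1, +1} branches, so that a dot product of two quantized vectors reduces to K² xnor + bitcount operations on bit-packed branches.

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two {-1,+1} vectors packed as bits (1 -> +1, 0 -> -1).

    xnor marks the matching positions; each match contributes +1 and each
    mismatch -1, so the result is 2*popcount(xnor) - n.
    """
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # keep only the n valid bits
    return 2 * bin(xnor).count("1") - n


def encode_branches(values, k):
    """Decompose quantized values into k bit-packed binary branches.

    Assumes the decomposition v = sum_i 2^i * b_i with b_i in {-1,+1},
    i.e. v is an odd integer in [-(2^k - 1), 2^k - 1].
    """
    branches = []
    for i in range(k):
        bits = 0
        for pos, v in enumerate(values):
            code = (v + (2 ** k - 1)) // 2  # map v to its unsigned code in [0, 2^k - 1]
            if (code >> i) & 1:             # b_i = +1 iff bit i of the code is set
                bits |= 1 << pos
        branches.append(bits)
    return branches


def quantized_dot(x, w, k):
    """<x, w> computed as k*k binary dot products, each one xnor + bitcount."""
    n = len(x)
    xb, wb = encode_branches(x, k), encode_branches(w, k)
    return sum(
        (2 ** (i + j)) * binary_dot(xb[i], wb[j], n)
        for i in range(k)
        for j in range(k)
    )
```

As a quick sanity check under these assumptions, `quantized_dot([3], [-1], k=2)` returns -3, matching the full-precision product 3 · (-1); on hardware, each `binary_dot` call is a single xnor followed by a bitcount, which is where the claimed speedup and memory saving come from.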
Keywords: quantized neural networks, {−1, +1} encoding decomposition, bitwise operations, smart chips
TL;DR: A novel encoding scheme that uses {-1, +1} to decompose QNNs into multi-branch binary networks, in which bitwise operations (xnor and bitcount) achieve model compression, computational acceleration, and resource saving.