OneBNet: Binarized Neural Networks using Decomposed 1-D Binarized Convolutions on Edge Device

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Binarized Neural Networks, Computer Vision, Inference, 1-D convolution
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Binarized Neural Networks using Decomposed 1-D Binarized Convolutions
Abstract: Although 2-D convolutions are the de facto standard in convolutional neural networks (CNNs) for computer vision, this paper shows that 1-D binarized convolutions can achieve excellent performance on CPU-based edge devices. We propose a new structure called OneBNet that maximizes the benefits of 1-D binarized convolutions. The proposed 1-D downsampling compresses information gradually through two 1-D convolutions, which contributes substantially to the performance of binarized convolutional neural networks (BCNNs). An $n \times n$ 2-D binarized convolution is replaced by an $n \times 1$ row-wise and a $1 \times n$ column-wise 1-D binarized convolution, so, compared with the 2-D case, the adjustment of the activation distribution and the non-linear activation function are each applied twice. Although the decomposed 1-D binarized convolution reduces computational costs, it doubles the number of element-wise activation functions and learnable bias layers, which can be a significant burden. Therefore, 1-D binarized convolutions are not suitable for all layers; we explain why and verify this assumption experimentally. Based on this analysis, we provide a structure that is better optimized in terms of both performance and cost. With ResNet as the backbone, we evaluate the proposed model on several standard image datasets. In experiments, the proposed model based on ResNet18 achieves 93.4\% and 93.6\% Top-1 accuracy on the FashionMNIST and CIFAR10 datasets. When trained from scratch on ImageNet, the proposed OneBNet based on ResNet18 reaches 63.9\% Top-1 accuracy, outperforming state-of-the-art (SOTA) binarized CNNs based on ResNet18; with teacher-student training, it achieves 68.4\% Top-1 accuracy, clearly surpassing existing SOTA BCNNs. At the cost of 5\% additional latency on a single thread of a Raspberry Pi, the proposed lightweight model achieves 67.3\% Top-1 accuracy on ImageNet, outperforming the baseline by 1.8\%.
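To make the decomposition concrete, the PyTorch sketch below shows what replacing an $n \times n$ binarized convolution with an $n \times 1$ row-wise followed by a $1 \times n$ column-wise binarized convolution could look like. This is an illustration inferred from the abstract, not the authors' implementation: the straight-through sign binarization, the channel-wise learnable biases, and the PReLU non-linearities are assumptions based on the mention of "learnable bias layers" and "element-wise activation functions", and all class names are hypothetical. Note how each 1-D half carries its own bias and activation, which is why the element-wise operation count doubles.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryActivation(nn.Module):
    """Sign activation with a straight-through estimator (STE) for backprop."""
    def forward(self, x):
        binary = torch.sign(x)
        clipped = x.clamp(-1, 1)
        # Forward value is sign(x); gradient flows through the clipped identity.
        return binary.detach() + clipped - clipped.detach()


class BinaryConv2d(nn.Conv2d):
    """Conv2d whose weights are binarized to {-1, +1} on the fly (STE on weights)."""
    def forward(self, x):
        w = self.weight
        bw = torch.sign(w).detach() + w - w.detach()  # straight-through on weights
        return F.conv2d(x, bw, self.bias, self.stride, self.padding)


class Decomposed1DBinaryBlock(nn.Module):
    """One n x n 2-D binarized convolution replaced by an n x 1 (row-wise) then
    a 1 x n (column-wise) 1-D binarized convolution. Each half has its own
    learnable channel-wise bias (distribution shift) and PReLU, so the
    distribution adjustment / non-linearity is applied twice."""
    def __init__(self, channels, n=3):
        super().__init__()
        pad = n // 2
        self.bin_act1 = BinaryActivation()
        self.conv_row = BinaryConv2d(channels, channels, (n, 1), padding=(pad, 0), bias=False)
        self.bias1 = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.act1 = nn.PReLU(channels)
        self.bin_act2 = BinaryActivation()
        self.conv_col = BinaryConv2d(channels, channels, (1, n), padding=(0, pad), bias=False)
        self.bias2 = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.act2 = nn.PReLU(channels)

    def forward(self, x):
        out = self.act1(self.conv_row(self.bin_act1(x)) + self.bias1)
        out = self.act2(self.conv_col(self.bin_act2(out)) + self.bias2)
        return out


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(Decomposed1DBinaryBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```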
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4273