everyone
since 04 Oct 2024">EveryoneRevisionsBibTeXCC BY 4.0
Existing 3D occupancy networks demand significant hardware resources, hindering the deployment of edge devices. Binarized Neural Networks (BNNs) offer a potential solution by substantially reducing computational and memory requirements. However, their performances decrease notably compared to full-precision networks. In addition, it is challenging to enhance the performance of the binarized model by increasing the number of binarized convolutional layers, which limits its practicability for 3D occupancy prediction. This paper presents two original insights into binarized convolution, substantiated with theoretical proofs: (a) $1\times1$ binarized convolution introduces minimal binarization errors as the network deepens, and (b) binarized convolution is inferior to full-precision convolution in capturing cross-channel feature importance. Building on the above insights, we propose a novel binarized deep convolution (BDC) unit that significantly enhances performance, even when the number of binarized convolutional layers increases. Specifically, in the BDC unit, additional binarized convolutional kernels are constrained to $1\times1$ to minimize the effects of binarization errors. Further, we propose a per-channel refinement branch to reweight the output via first-order approximation. Then, we partition the 3D occupancy networks into four convolutional modules, using the proposed BDC unit to binarize them. The proposed BDC unit minimizes binarization errors and improves perceptual capability while significantly boosting computational efficiency, meeting the stringent requirements for accuracy and speed in occupancy prediction. Extensive quantitative and qualitative experiments validate that the proposed BDC unit supports state-of-the-art precision in occupancy prediction and object detection tasks with substantially reduced parameters and operations. Code is provided in the supplementary material and will be open-sourced upon review.