Dispense Mode for Inference to Accelerate Branchynet

Published: 01 Jan 2022 · Last Modified: 17 May 2025 · ICIP 2022 · CC BY-SA 4.0
Abstract: As depth and width have grown, deep neural networks (DNNs) have achieved the best results in computer vision, but their massive computation places a heavy burden on IoT devices. To speed up DNN inference, BranchyNet introduced the early exit, letting samples exit from shallow layers to reduce the model's computation. However, BranchyNet performs unnecessary intermediate computations during inference. We propose a dispense mode to address this problem, which optimizes the accuracy and latency of BranchyNet at the same time. The dispense mode directly determines a sample's exit position in the multi-branch network according to the sample's difficulty, without intermediate trial and error. Under the same accuracy requirements, inference speed improves by 30%-50% compared with BranchyNet's cascade mode. Moreover, while further reducing redundant computation, the dispense mode provides a mechanism for dynamically adjusting accuracy, so our framework can easily trade accuracy for higher throughput.
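To make the contrast concrete, the sketch below is a minimal, hypothetical illustration (not the authors' code) of the two inference strategies described in the abstract: cascade mode evaluates each exit in turn and only stops when a confidence test passes, while dispense mode picks the exit up front from an estimate of sample difficulty and runs only the layers needed for that exit. The model, the `Dispenser` module, and the entropy threshold are illustrative assumptions.

```python
# Hypothetical sketch contrasting BranchyNet-style cascade inference with the
# proposed dispense mode. All class names and thresholds are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyBranchyNet(nn.Module):
    """A toy multi-exit CNN: two early exits plus a final exit."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.exits = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes)),
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)),
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)),
        ])
        self.stages = [self.stage1, self.stage2, self.stage3]

    def forward_cascade(self, x: torch.Tensor, threshold: float = 0.5):
        """Cascade mode: try each exit in order and stop once the prediction
        entropy drops below the threshold (work at rejected exits is wasted)."""
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)
            logits = exit_head(x)
            probs = F.softmax(logits, dim=1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
            if entropy.item() < threshold:
                return logits
        return logits

    def forward_dispense(self, x: torch.Tensor, exit_idx: int):
        """Dispense mode: the exit is chosen up front, so only the stages
        leading to that exit (and its classifier head) are executed."""
        for stage in self.stages[: exit_idx + 1]:
            x = stage(x)
        return self.exits[exit_idx](x)


class Dispenser(nn.Module):
    """Lightweight difficulty estimator mapping an input directly to an exit
    index; in practice it would be trained so easy samples get exit 0."""

    def __init__(self, num_exits: int = 3):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(3 * 4 * 4, num_exits)
        )

    def forward(self, x: torch.Tensor) -> int:
        return int(self.head(x).argmax(dim=1).item())


if __name__ == "__main__":
    model, dispenser = TinyBranchyNet(), Dispenser()
    sample = torch.randn(1, 3, 32, 32)
    with torch.no_grad():
        logits_cascade = model.forward_cascade(sample)                 # trial-and-error exits
        logits_dispense = model.forward_dispense(sample, dispenser(sample))  # direct routing
    print(logits_cascade.shape, logits_dispense.shape)
```

Because the dispenser decides the exit before the backbone runs, no classifier head is evaluated and then discarded; raising or lowering the exits the dispenser is allowed to choose is one plausible way the accuracy/throughput trade-off described in the abstract could be adjusted.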