Abstract: Mixed-precision quantization with adaptive bitwidth allocation for neural networks has achieved higher compression rates and accuracy in classification tasks. However, it has not been well explored for object detection networks. In this paper, we propose a novel mixed-precision quantization scheme with a dynamic Hessian matrix for object detection networks. We iteratively select the layer with the lowest sensitivity, as measured by the Hessian matrix, and lower its precision until the required compression ratio is reached. The L-BFGS algorithm is used to update the Hessian matrix in each quantization iteration. Moreover, we design a loss function specifically for object detection networks that jointly considers the effects of quantization on the classification and regression losses. Experimental results on RetinaNet and Faster R-CNN show that the proposed scheme, DHMQ, achieves state-of-the-art performance for quantized object detectors.
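The iterative selection step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `greedy_bit_allocation`, the `sensitivity` callable, the candidate bitwidths, and the toy scores are all hypothetical, and the Hessian-based sensitivity (with its L-BFGS update) is abstracted behind a user-supplied function that is re-evaluated in every iteration.

```python
from typing import Callable, Dict, List


def greedy_bit_allocation(
    layer_sizes: Dict[str, int],
    sensitivity: Callable[[str, int, int], float],
    bit_choices: List[int],
    target_ratio: float,
    full_bits: int = 32,
) -> Dict[str, int]:
    """Lower per-layer bitwidths greedily until the target compression ratio is met.

    `sensitivity(name, old_bits, new_bits)` is a hypothetical stand-in for a
    Hessian-based score; it is called afresh in each iteration, which is where
    a dynamically updated Hessian estimate would enter.
    """
    choices = sorted(bit_choices, reverse=True)
    # Start every layer at the highest allowed precision.
    bits = {name: choices[0] for name in layer_sizes}
    full_size = full_bits * sum(layer_sizes.values())

    def ratio() -> float:
        # Compression ratio relative to a full-precision model.
        quantized = sum(bits[n] * layer_sizes[n] for n in layer_sizes)
        return full_size / quantized

    while ratio() < target_ratio:
        # Collect every layer that can still drop one precision level.
        candidates = []
        for name in layer_sizes:
            idx = choices.index(bits[name])
            if idx + 1 < len(choices):
                new_bits = choices[idx + 1]
                score = sensitivity(name, bits[name], new_bits)
                candidates.append((score, name, new_bits))
        if not candidates:
            break  # Nothing left to lower; the target is unreachable.
        # Lower the precision of the least sensitive layer.
        _, name, new_bits = min(candidates)
        bits[name] = new_bits
    return bits


# Toy usage with made-up per-layer scores (assumption, not measured data).
toy_scores = {"a": 1.0, "b": 10.0}
toy_sizes = {"a": 100, "b": 100}


def toy_sensitivity(name: str, old_bits: int, new_bits: int) -> float:
    return toy_scores[name] / new_bits


allocation = greedy_bit_allocation(toy_sizes, toy_sensitivity, [8, 4, 2], 5.0)
print(allocation)  # {'a': 4, 'b': 8}
```

In this toy run, layer `a` is less sensitive, so it is lowered first, and the loop stops as soon as the ratio reaches the target. In a real setting, the sensitivity function would be computed from the detection loss rather than from fixed scores.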