Abstract: To reduce computational redundancy inherent in fixed-bit-width quantization, input-adaptive quantization dynamically adjusts the bit-width of network parameters based on the difficulty of the given input. However, estimating image difficulty for object detection is a non-trivial task, as multiple detection results may occur within a single image. In this paper, we propose an input-adaptive mixed-precision framework that automatically adjusts the bit-width of each layer in the target model based on the characteristics of an input image. For searching optimal bit configurations, the framework employs a reward function that considers both the difficulty of a single image and the computational cost. Experimental results demonstrate that the proposed method outperforms prior quantization methods with fixed bit-widths.
External IDs:dblp:conf/iscas/JeongJKJL25
Loading