Abstract: Deep learning models have become increasingly prevalent across many domains, which necessitates deploying them on resource-constrained devices. Quantization is a promising way to reduce model complexity because it keeps the model architecture intact and enables the model to run on specialized hardware (e.g., NPUs, DSPs). Input resolution is also essential to the trade-off between accuracy and computation. In this paper, we jointly analyze how input resolution and quantization precision influence accuracy for three popular models: ResNet-18, ResNet-50, and MobileNet-V2. By exploring the combined configuration space, we find that jointly optimizing the input resolution and quantization bit-width achieves better accuracy at the same computational complexity.
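
To make the idea of trading resolution against bit-width at a fixed compute budget concrete, the following is a minimal sketch and not the cost model used in the paper. It assumes a simple proxy in which the number of multiply-accumulates grows with the square of the input resolution and each multiply-accumulate costs the product of the weight and activation bit-widths. The function names, the 224-pixel 8-bit baseline, and the proxy itself are illustrative assumptions. The sketch only enumerates configurations that fit the budget; the accuracy of each one would still have to be measured.

```python
from itertools import product


def relative_cost(resolution, weight_bits, act_bits,
                  base_resolution=224, base_bits=8):
    """Compute cost relative to a baseline configuration.

    Assumed proxy (not taken from the paper): MAC count scales with
    resolution squared, and each MAC costs weight_bits * act_bits.
    """
    spatial = (resolution / base_resolution) ** 2
    precision = (weight_bits * act_bits) / (base_bits * base_bits)
    return spatial * precision


def configs_within_budget(resolutions, bit_widths, budget=1.0):
    """List (resolution, bits, cost) tuples whose cost fits the budget.

    Weights and activations share one bit-width here for simplicity.
    Results are sorted from most to least expensive.
    """
    feasible = []
    for res, bits in product(resolutions, bit_widths):
        cost = relative_cost(res, bits, bits)
        if cost <= budget:
            feasible.append((res, bits, cost))
    return sorted(feasible, key=lambda c: c[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical search grid; the values are for illustration only.
    for res, bits, cost in configs_within_budget([160, 224, 320, 448], [4, 6, 8]):
        print(f"resolution={res:4d}  bits={bits}  relative cost={cost:.2f}")
```

Under this proxy, a 448-pixel input at 4 bits has the same cost as a 224-pixel input at 8 bits, which is the kind of equal-compute pair a joint search would compare.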