Abstract: In this brief paper, a technique based on bilinear interpolation is presented to reduce computation in Convolutional Neural Networks (CNNs) while maintaining accuracy comparable to the baseline. Using the proposed technique, the number of computations can be reduced by almost 50%. Although the number of parameters can exceed that of the baseline model, standard pruning and quantization methods can help accommodate this increase. The basic idea is to replace overlapping strides with a linearly interpolated value obtained from non-overlapping strides. The weights defining the interpolation are trainable, which helps achieve accuracy comparable to the baseline. Empirical results corroborate the presented approach.
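The core mechanism described above can be sketched in code. The snippet below is a minimal illustration, not the paper's exact formulation: the module name `InterpolatedConv2d`, the use of a grouped transposed convolution to realize the trainable interpolation, and the PyTorch framing are all assumptions made for clarity. It replaces a dense, overlapping (stride-1) convolution with a cheap non-overlapping one, then fills in the remaining output positions with a learned interpolation.

```python
import torch
import torch.nn as nn

class InterpolatedConv2d(nn.Module):
    """Hypothetical sketch of the abstract's idea: convolve only at
    non-overlapping window positions (stride = kernel size), then
    recover a dense feature map via a trainable interpolation."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # Non-overlapping windows: far fewer positions are computed
        # than with the usual overlapping stride-1 convolution.
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              stride=kernel_size)
        # Trainable per-channel interpolation, realized here (as an
        # assumption) by a grouped transposed convolution; its weights
        # play the role of the trainable interpolation coefficients.
        self.up = nn.ConvTranspose2d(out_ch, out_ch, kernel_size,
                                     stride=kernel_size, groups=out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        coarse = self.conv(x)   # cheap pass over non-overlapping strides
        return self.up(coarse)  # interpolate back to dense resolution

# Usage: a dense feature map from far fewer convolution evaluations.
x = torch.randn(1, 16, 32, 32)
y = InterpolatedConv2d(16, 32)(x)
```

In this sketch, initializing `self.up` with bilinear weights and letting them train would mirror the abstract's description of trainable interpolation weights; the extra parameters in `self.up` also reflect the noted parameter increase over the baseline.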