Abstract: In this paper, we propose a method to reduce the memory requirement of Convolutional Neural Networks (CNN) in the inference phase. Before feeding an input image into the CNN model, input image will split evenly to several sub-images and feed them into models respectively and combine the output after a certain layer.
0 Replies
Loading