Highlights
• A bypass structure, Pooling Block, is proposed to fuse features from different layers.
• Pooling Block is easy to add to lightweight networks with only a few extra parameters.
• Experiments are conducted on 7 widely used datasets to validate Pooling Block.
• Experimental results show improvements with Pooling Block on many computer vision tasks.
• Pooling Block can accelerate the speed of convergence for image classification.

Abstract
Since lightweight neural networks such as MobileNet and SqueezeNet were designed, it has been possible to run deep neural networks on mobile devices. However, the performance of these lightweight networks is not as good as that of general deeper networks, especially for image classification and object detection. To address this issue, we design a general subnetwork, called Pooling Block, which extracts information from the input and fuses it with the information from the original backbone network. Our study shows that a backbone network combined with the Pooling Block achieves better performance on lightweight networks at the cost of only a few extra parameters and little additional computation. Meanwhile, we find that the Pooling Block evidently accelerates convergence during the training phase. We integrate Pooling Block into MobileNet V2 and conduct image classification, object detection, and pedestrian detection experiments on many widely used datasets, including Animals with Attributes 2, Caltech-101, Caltech-256, PASCAL VOC 2007, MS COCO 2017, Caltech Pedestrian, and Citypersons. Experimental results show that adding Pooling Block yields a 2% improvement on the image classification datasets and more than a 0.5% improvement on the object detection datasets, with little or no loss of computational speed.
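To make the idea of a bypass Pooling Block concrete, below is a minimal sketch, assuming a simple design in which the network input is average-pooled down to the resolution of a backbone feature map, projected with a 1x1 convolution, and fused by element-wise addition. The class name PoolingBlock, the channel sizes, and the fusion by addition are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class PoolingBlock(nn.Module):
    """Hypothetical bypass block: pools the network input to the spatial size
    of a backbone feature map, projects it with a 1x1 convolution, and fuses
    it with that feature map by element-wise addition (illustration only)."""

    def __init__(self, in_channels, out_channels, stride):
        super().__init__()
        # Average pooling reduces the input to the backbone feature resolution.
        self.pool = nn.AvgPool2d(kernel_size=stride, stride=stride)
        # Lightweight 1x1 projection keeps the extra parameter count small.
        self.proj = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU6(inplace=True),
        )

    def forward(self, x, backbone_feature):
        bypass = self.proj(self.pool(x))
        # Fuse the bypass information with the backbone feature map.
        return backbone_feature + bypass


# Example: fuse a 3-channel input with a 32-channel feature map at 1/4 resolution.
if __name__ == "__main__":
    block = PoolingBlock(in_channels=3, out_channels=32, stride=4)
    x = torch.randn(1, 3, 224, 224)    # network input
    feat = torch.randn(1, 32, 56, 56)  # backbone feature map
    print(block(x, feat).shape)        # torch.Size([1, 32, 56, 56])

Because the bypass only adds a pooling layer and a 1x1 convolution per fused stage, its parameter and computation overhead stays small relative to the MobileNet V2 backbone, which is consistent with the abstract's claim of little or no loss of computational speed.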