Keywords: Fine-grained representation learning, Re-identification, Pooling operation
Abstract: Fine-grained recognition aims to discriminate between sub-categories of images within one general category. It is fundamentally difficult because it requires extracting fine-grained features from subtle regions. However, a Convolutional Neural Network typically applies strided operations to downsample the representation, which excessively reduces the feature resolution and leads to a significant loss of fine-grained information. In this paper, we propose Adaptive Region Pooling (ARP): a novel downsampling algorithm that makes the network focus only on a smaller but more critical region while simultaneously increasing the resolution of the sub-sampled feature. ARP provides a trade-off mechanism that lets users actively balance the scale of the receptive field and the granularity of the feature. Having no learnable parameters, ARP also gives the network a more stable training process and earlier convergence. Extensive experiments qualitatively and quantitatively validate the effectiveness and efficiency of the proposed pooling operation and show superior performance over state-of-the-art methods in both image classification and image retrieval.
One-sentence Summary: This paper introduces a novel pooling operation that automatically focuses on a smaller but more critical region and simultaneously enriches the granularity of the sub-sampled representations.
Supplementary Material: zip