- Keywords: global pooling, fine-grained recognition, benchmark
- TL;DR: A benchmark of nine representative global pooling schemes for fine-grained recognition finds that max pooling consistently outperforms average pooling.
- Abstract: Global feature pooling is a modern variant of feature pooling that provides better interpretability and regularization. Although alternative pooling methods exist (e.g., max, Lp norm, stochastic), averaging is still the dominant global pooling scheme in popular models. Because fine-grained recognition requires learning subtle, discriminative features, we consider the question: is average pooling the optimal strategy? We first ask: ``is there a difference between features learned by global average and max pooling?'' Visualization and quantitative analysis show that max pooling encourages learning features of different spatial scales. We then ask: ``is there a single global feature pooling variant that is most suitable for fine-grained recognition?'' A thorough evaluation of nine representative pooling algorithms finds that max pooling outperforms average pooling consistently across models, datasets, and image resolutions; that it does so by reducing the generalization gap; and that the performance of generalized pooling increases almost monotonically as it moves from average to max. We finally ask: ``what is the best way to combine two heterogeneous pooling schemes?'' Common strategies struggle because of potential gradient conflict, whereas the ``freeze-and-train'' trick works best. We also find that post-global batch normalization speeds up convergence and consistently improves model performance.
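
The idea of a pooling operator that moves from average to max can be illustrated with a small sketch. The abstract does not say which generalized pooling formulation was evaluated, so the power-mean form below, the function name `generalized_mean_pool`, and its `p` and `eps` parameters are assumptions made for illustration only. The code uses plain Python lists to stay self-contained; a real model would apply the same formula to tensors.

```python
import math


def generalized_mean_pool(feature_map, p=1.0, eps=1e-6):
    """Pool a C x H x W feature map (nested lists) to C values with a power mean.

    p = 1 gives global average pooling; p = math.inf gives global max pooling.
    Assumes p >= 1. Activations are clamped to eps so the power mean is defined.
    This is an illustrative formulation, not necessarily the one benchmarked.
    """
    pooled = []
    for channel in feature_map:
        # Flatten the H x W grid and clamp to a small positive value.
        values = [max(v, eps) for row in channel for v in row]
        if math.isinf(p):
            pooled.append(max(values))
        else:
            mean_of_powers = sum(v ** p for v in values) / len(values)
            pooled.append(mean_of_powers ** (1.0 / p))
    return pooled


# One channel with a single strong activation among weak ones.
fmap = [[[0.1, 0.2], [0.3, 2.0]]]
for p in (1.0, 3.0, 10.0, math.inf):
    print(p, generalized_mean_pool(fmap, p))
```

As `p` grows, the pooled value is pulled away from the spatial mean and toward the strongest activation, which is the average-to-max continuum the abstract refers to.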