Spatial-aware collaborative region mining for fine-grained recognition

Published: 2024, Last Modified: 27 Jan 2025. Venue: Multim. Tools Appl. 2024. License: CC BY-SA 4.0
Abstract: Fine-grained recognition aims to classify images into hundreds of subcategory labels under a generic category. The main challenge is the similar appearance of subcategories, which requires a model to discover the discriminative regions automatically. Most existing approaches either mine the informative regions without considering the interclass relationship, or focus on pairwise images and neglect the multiple-class relationship; both lead to incomplete information and a tendency to focus on a single region. Since interclass correlations and discriminative regions both play an important role in distinguishing one fine-grained category from others, we propose a new Spatial-aware Collaborative Region Mining (SCRIM) scheme that fully exploits the relationships between inter- and intraclass regions. The proposed SCRIM scheme consists of two modules that collaboratively mine spatially aware features: the Coarse Parts Localization (CPL) module, which exploits the hierarchical inter- and intraclass correlations, and the Fine Parts Localization (FPL) module, which mines the multi-scale fine discriminative parts. Specifically, dual CPLs separately create two groups of contrastive part features by extracting contrastive features for each image. Features from the same class and the same module should have smaller distances. Given the extracted features, dual FPLs further mine and update the fine region features by ranking their informativeness scores with ground-truth subcategory labels. Through the collaboration between the CPL and FPL modules, our SCRIM scheme takes the hierarchical correlations between multiple samples into account and mines the multi-scale discriminative parts for the final fine-grained classification. Extensive experiments on three popular benchmarks show that the proposed SCRIM outperforms state-of-the-art methods by a large margin.
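The abstract does not give the loss or the ranking rule, so the following is only an illustrative sketch of the two ideas it names: pulling same-class features closer together, and ranking candidate regions by a score. It assumes a generic margin-based pairwise contrastive loss and a plain top-k selection as stand-ins; neither is confirmed to be what SCRIM uses, and the names `contrastive_loss`, `top_k_regions`, and `margin` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(features, labels, margin=1.0):
    """Mean pairwise contrastive loss (an assumed stand-in, not the paper's loss).

    Same-class pairs are penalized by their squared distance, so they are
    pulled together. Different-class pairs are penalized only when they are
    closer than `margin`, so they are pushed apart.
    """
    total, pairs = 0.0, 0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            d = euclidean(features[i], features[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


def top_k_regions(scores, k):
    """Indices of the k highest-scoring regions, best first (assumed ranking rule)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy usage: two samples of class 0 that coincide, one distant sample of class 1.
features = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
labels = [0, 0, 1]
loss = contrastive_loss(features, labels, margin=1.0)

# Keep the two most informative of three candidate regions.
selected = top_k_regions([0.1, 0.9, 0.5], k=2)
```

In this toy case the loss is zero because the same-class pair already coincides and the other class lies beyond the margin. In a real model the features and scores would come from learned network branches rather than hand-written lists.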