Abstract: A major challenge in Fine-Grained Visual Classification (FGVC) is distinguishing categories with high inter-class similarity by learning features that capture the subtle details separating them. Conventional Convolutional Neural Networks (CNNs) trained with cross-entropy struggle with this challenge because they tend to produce inter-class invariant features in FGVC. In this work, we propose to regularize CNN training by enforcing the uniqueness of each category's features from an information-theoretic perspective. To achieve this goal, we formulate a minimax loss within a game-theoretic framework and prove that a Nash equilibrium of the game is consistent with this regularization objective. In addition, to avoid solutions that produce redundant features, we present a Feature Redundancy Loss (FRL), based on the normalized inner product between each pair of selected feature maps, to complement the proposed minimax loss. The proposed method is versatile: it can be used as a regularizer for features in the mid-level or the penultimate layer and can be combined with any architecture. Extensive experimental results on several influential benchmarks, along with visualizations, show that our method obtains significant improvement over the baseline model without extra cost and achieves state-of-the-art results.
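As an illustration of what a redundancy penalty built on normalized inner products between feature map pairs could look like, the following is a minimal sketch, not the paper's implementation. The function name `feature_redundancy_penalty`, the use of the absolute value, the averaging over pairs, and the choice of NumPy rather than a deep learning framework are all assumptions made here for clarity; the abstract does not specify how maps are selected or how pairwise terms are aggregated.

```python
import numpy as np


def feature_redundancy_penalty(feature_maps, eps=1e-8):
    """Hypothetical redundancy penalty (illustrative, not the paper's FRL).

    feature_maps: array of shape (C, H, W), one spatial map per selected channel.
    Returns the mean absolute normalized inner product (cosine similarity)
    over all distinct pairs of maps: 0 for mutually orthogonal maps,
    1 when every map is a positive multiple of every other.
    """
    c = feature_maps.shape[0]
    # Flatten each map to a vector so the inner product is over spatial positions.
    flat = feature_maps.reshape(c, -1).astype(np.float64)
    # L2-normalize each vector; eps guards against all-zero maps.
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, eps)
    # (C, C) matrix of normalized inner products.
    sim = unit @ unit.T
    # Keep only distinct pairs (strict upper triangle).
    rows, cols = np.triu_indices(c, k=1)
    if rows.size == 0:
        return 0.0  # a single map has no pairs to compare
    # Assumption: aggregate by mean absolute similarity.
    return float(np.abs(sim[rows, cols]).mean())


# Two maps that activate at disjoint locations are not redundant.
disjoint = np.zeros((2, 2, 2))
disjoint[0, 0, 0] = 1.0
disjoint[1, 1, 1] = 1.0

# Three identical maps are maximally redundant.
duplicated = np.ones((3, 2, 2))

penalty_disjoint = feature_redundancy_penalty(disjoint)
penalty_duplicated = feature_redundancy_penalty(duplicated)
```

In a training setup, a term of this kind would be added to the main loss with a weighting coefficient, so that minimizing it pushes the selected feature maps toward responding to different spatial patterns.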
External IDs: dblp:conf/icassp/ZhengLYZCD25