CLUE: Fine-Grained Self-Supervised Learning with Multi-Level Regularization

17 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Self-Supervised Learning, Fine-Grained Learning
Abstract: Self-supervised learning (SSL) has achieved strong results on coarse-grained tasks but often struggles with fine-grained recognition, where categories differ only in subtle local cues. Strong downstream transfer requires features that form compact within-class clusters with large inter-class margins at the fine level. Standard SSL losses, however, either over-separate visually similar subcategories by treating all non-positives as equally negative, or overlook part-based evidence and thus merge them under coarse prototypes. We propose a multi-level regularization framework that improves clustering across granularities. At the global level, a soft variant of InfoNCE down-weights likely false negatives and enhances class separation. At the part level, clustering on local descriptors preserves subtle intra-class distinctions. At the instance level, semantic descriptions from vision–language models provide attribute-level anchors. Together, these components yield representations that cluster coherently at every granularity. Experiments on CUB-200-2011, Stanford Cars, and FGVC-Aircraft show consistent improvements in both classification and retrieval, validating our approach for fine-grained SSL.
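To make the global-level objective concrete, the following is a minimal sketch of one way a "soft" InfoNCE variant can down-weight likely false negatives in a batch; the function name, the similarity-based weighting scheme, and the temperature value are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def soft_info_nce(z1, z2, temperature=0.1):
    # z1, z2: embeddings of two augmented views of the same batch, shape (N, D).
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # (N, N) scaled cosine similarities

    # Hypothetical soft-negative weights: off-diagonal pairs that are highly
    # similar to the anchor are likely false negatives (e.g., the same
    # fine-grained subcategory), so their contribution is reduced rather
    # than treated as fully negative.
    with torch.no_grad():
        weights = (1.0 - (z1 @ z2.t()).clamp(min=0.0))  # in [0, 1]
        weights.fill_diagonal_(1.0)  # positives keep full weight

    # Weighted InfoNCE: positives on the diagonal, soft-weighted denominator.
    denom = (weights * torch.exp(logits)).sum(dim=1)
    return -(logits.diag() - torch.log(denom)).mean()

With all weights fixed at 1, this reduces to standard InfoNCE over in-batch negatives, which is the behavior the soft variant is meant to relax.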
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 9223