Cross-Instance Contrastive Masking in Vision Transformers for Self-Supervised Hyperspectral Image Classification

Published: 06 Mar 2025, Last Modified: 06 Mar 2025, SCSL @ ICLR 2025, CC BY 4.0
Track: tiny / short paper (up to 3 pages)
Keywords: Cross-Instance Contrastive Masking, Vision Transformer, Hyperspectral Image, Self-supervision
TL;DR: The article introduces Cross-Instance Contrastive Masking, a method for hyperspectral image classification that improves feature extraction and reduces shortcut learning through dynamic contrastive masking.
Abstract: This article presents a novel Cross-Instance Contrastive Masking-Enhanced Vision Transformer (CICM-ViT) for hyperspectral image (HSI) classification, which uses Cross-Instance Contrastive Masking (CICM) to reduce shortcut learning and enhance self-supervised spectral-spatial feature extraction. By exploiting dependencies between instances, CICM-ViT dynamically masks spectral patches across instances, promoting the learning of discriminative features while reducing redundancy, especially in low-data settings. This approach mitigates shortcut learning by encouraging the model to attend to global patterns rather than local spurious correlations. CICM-ViT achieves state-of-the-art performance on HSI datasets, with 99.91% OA on Salinas, 96.88% OA on Indian Pines, and 98.88% OA on Botswana, outperforming fourteen SOTA CNN- and transformer-based approaches in both accuracy and efficiency, with only 89,680 parameters.
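The abstract describes masking spectral patches dynamically across instances based on inter-instance dependencies. The paper itself does not spell out the masking rule, so the following is only a minimal illustrative sketch of one plausible reading: score each patch position by its mean cosine similarity to the same position in the other instances of the batch, then mask the most redundant positions per instance. The function name, similarity measure, and masking ratio are all assumptions, not the authors' implementation.

```python
import numpy as np

def cross_instance_mask(patches, mask_ratio=0.5):
    """Illustrative sketch (not the paper's implementation) of
    cross-instance contrastive masking.

    patches: (B, N, D) array of patch embeddings for a batch of
    B instances, each with N spectral patches of dimension D.
    Returns a boolean (B, N) mask, True = patch is masked.
    """
    B, N, D = patches.shape
    # L2-normalise embeddings so dot products are cosine similarities
    normed = patches / (np.linalg.norm(patches, axis=-1, keepdims=True) + 1e-8)
    # sim[n, b, c]: similarity between instances b and c at patch position n
    sim = np.einsum('bnd,cnd->nbc', normed, normed)
    # mean similarity to the *other* instances (drop the self-similarity of 1)
    redundancy = (sim.sum(axis=-1) - 1.0).T / max(B - 1, 1)   # (B, N)
    # mask the top-k most redundant patch positions per instance
    k = int(N * mask_ratio)
    order = np.argsort(-redundancy, axis=1)  # most redundant first
    mask = np.zeros((B, N), dtype=bool)
    rows = np.arange(B)[:, None]
    mask[rows, order[:, :k]] = True
    return mask
```

Masking the patches that are most similar across instances would push the self-supervised objective toward features that discriminate between instances, consistent with the abstract's stated goal of reducing redundancy and shortcut learning.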
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Presenter: ~Abhiroop_Chatterjee1
Submission Number: 61