TL;DR: Improving classification performance on imbalanced datasets by fusing embeddings from foundation models.
Abstract: While foundation models pretrained on remote sensing data have been applied successfully to various advanced tasks, their susceptibility to class imbalance and the potential of targeted mitigation strategies remain underexplored. We present an empirical evaluation of six remote sensing foundation models on two SAR ship classification datasets: OpenSARShip and FUSARShip. We demonstrate that commonly used metrics such as accuracy and weighted F1-score mask poor performance on minority classes, and we advocate for macro F1-score as a more reliable measure for imbalanced datasets. Our experiments reveal that several foundation models underperform ImageNet-pretrained baselines (ResNet, VGG, ViT). We investigate four fusion strategies for combining embeddings from multiple foundation models in both full and lightweight configurations, where lightweight variants exclude the two poorest-performing models. Lightweight late fusion with informed weighting improves macro F1 by up to 5.5 percentage points over the best individual foundation models, while reducing computational overhead.
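Illustration: the metric argument in the abstract can be made concrete with a small sketch. The class names, class counts, and the majority-class predictor below are hypothetical toy values, not data or results from the paper; the helper functions `per_class_f1` and `summarize` are likewise illustrative names. The sketch only shows why accuracy and weighted F1 can stay high while macro F1 exposes failure on minority classes.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its F1 score."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred):
    """Compute accuracy, support-weighted F1, and macro (unweighted) F1."""
    f1 = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    weighted_f1 = sum(f1[c] * support[c] for c in f1) / n
    macro_f1 = sum(f1.values()) / len(f1)
    return {"accuracy": accuracy, "weighted_f1": weighted_f1, "macro_f1": macro_f1}


# Hypothetical imbalanced toy labels (assumed counts, not from the paper).
y_true = ["cargo"] * 90 + ["tanker"] * 8 + ["fishing"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["cargo"] * 100

scores = summarize(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

On this toy example accuracy is 0.900 and weighted F1 is about 0.853, while macro F1 is about 0.316, because the two minority classes each contribute an F1 of zero with equal weight.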
Submission Number: 36