In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images
Abstract: Remote sensing images (RSIs) frequently contain multi-scale objects across classes and unevenly distributed objects because of scene limitations. The resulting data imbalance makes semantic segmentation of RSIs challenging. Recent methods leverage various deep learning techniques to capture high-quality representations for RSI semantic segmentation, but they can hardly address this challenge because they have done little to explore the mechanisms behind the representations. The recently discovered Neural Collapse (NC) phenomenon in computer vision models suggests the simplex equiangular tight frame (ETF) as the optimal representation structure. Motivated by this, we observe that data imbalance disrupts the optimal structure of last-layer representations and that the inter-class representations of minor classes tend to move closer to each other. To address these issues, we propose Inter-class and Intra-class Neural Collapse Tuning (In2NeCT), which optimizes the representations toward the simplex ETF and thereby facilitates the discrimination of inter-class representations and the coherence of intra-class representations. Extensive experiments on three datasets demonstrate that In2NeCT consistently yields significant performance improvements and outperforms state-of-the-art methods.
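For readers unfamiliar with the simplex ETF mentioned above, the following minimal sketch builds one from the standard definition in the Neural Collapse literature, M = sqrt(K/(K-1)) (I - (1/K) 1 1^T), and checks its defining property: K unit-norm class vectors whose pairwise cosine is exactly -1/(K-1). It is an illustration of the target geometry only, not the In2NeCT method; the function names `simplex_etf` and `cosine` are hypothetical, and the sketch assumes the feature dimension equals the number of classes K (the usual partial-orthogonal rotation is taken to be the identity).

```python
import math


def simplex_etf(num_classes):
    """Return K vectors in R^K forming a simplex ETF (assumes rotation = identity)."""
    if num_classes < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Vector j is column j of scale * (I - (1/K) * ones * ones^T).
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for i in range(k)]
        for j in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    K = 6  # illustrative class count, not taken from the paper
    etf = simplex_etf(K)
    print("pairwise cosine:", cosine(etf[0], etf[1]))
    print("expected value: ", -1.0 / (K - 1))
```

Under this geometry every pair of class vectors is equally and maximally separated, which is the structure the abstract describes as being disrupted for minor classes under data imbalance.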