PolSARConvMixer: A Channel and Spatial Mixing Convolutional Algorithm for PolSAR Data Classification

Ali Jamali, Swalpa Kumar Roy, Bing Lu, Avik Bhattacharya, Pedram Ghamisi

Published: 01 Jan 2024, Last Modified: 13 Nov 2024. Venue: IGARSS 2024. License: CC BY-SA 4.0

Abstract: Given the effectiveness of deep Convolutional Neural Networks (CNNs) in computer vision, there has been a recent surge of interest in applying CNNs to image classification. Researchers are also exploring vision transformers for Earth observation applications, owing to their recent success. However, vision transformers require more training data than CNN classifiers. Furthermore, their self-attention has quadratic complexity in the number of input tokens and demands substantial hardware resources. For polarimetric synthetic aperture radar (PolSAR) image classification, we propose PolSARConvMixer, a basic framework that separates the mixing of spatial and channel dimensions, maintains uniform size and resolution throughout the network, and processes PolSAR image patches directly as input. Our experiments on two PolSAR benchmarks, Flevoland and San Francisco, show that PolSARConvMixer significantly outperforms several other algorithms, including AlexNet, ResNet, FNet, a 2D CNN, and PolSARFormer.
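
The abstract does not spell out the layer design, so the following is only an illustrative sketch of what "separating spatial and channel mixing at uniform resolution" can look like. It assumes a generic ConvMixer-style block: a depthwise convolution with a residual connection for spatial mixing, followed by a 1 x 1 (pointwise) convolution for channel mixing. The function names, the 6-channel 8 x 8 input patch, the 3 x 3 kernel size, and the omission of normalization layers are all assumptions made for illustration and are not taken from the paper. Plain Python lists are used to keep the sketch dependency-free.

```python
import math
import random


def gelu(v):
    """Exact GELU activation, computed with the Gaussian error function."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def depthwise_conv_same(x, kernels):
    """Spatial mixing: each channel is filtered by its own k x k kernel.

    x is [C][H][W], kernels is [C][k][k] with odd k. Zero padding keeps H x W,
    and no information crosses between channels. As in deep learning
    frameworks, this is a cross-correlation (the kernel is not flipped).
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    pad = k // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        for i in range(H):
            for j in range(W):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            acc += kernels[c][di][dj] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


def pointwise_conv(x, weights):
    """Channel mixing: a 1 x 1 convolution, i.e. the same linear map over
    channels applied independently at every pixel. weights is [C_out][C_in]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weights[o][c] * x[c][i][j] for c in range(C))
              for j in range(W)] for i in range(H)]
            for o in range(len(weights))]


def mixer_block(x, dw_kernels, pw_weights):
    """One hypothetical mixer block (normalization omitted for brevity)."""
    # Step 1: spatial mixing per channel, added back to the input (residual).
    spatial = depthwise_conv_same(x, dw_kernels)
    h = [[[xv + gelu(sv) for xv, sv in zip(xr, sr)]
          for xr, sr in zip(xc, sc)]
         for xc, sc in zip(x, spatial)]
    # Step 2: channel mixing at each pixel, followed by the activation.
    mixed = pointwise_conv(h, pw_weights)
    return [[[gelu(v) for v in row] for row in ch] for ch in mixed]


def shape(t):
    """Return (channels, height, width) of a nested-list tensor."""
    return (len(t), len(t[0]), len(t[0][0]))


# Hypothetical input: a 6-channel 8 x 8 patch with random values and weights.
random.seed(0)
C, H, W, K = 6, 8, 8, 3
x = [[[random.gauss(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
dw = [[[random.gauss(0, 0.1) for _ in range(K)] for _ in range(K)] for _ in range(C)]
pw = [[random.gauss(0, 0.1) for _ in range(C)] for _ in range(C)]

y = mixer_block(x, dw, pw)
print(shape(x), shape(y))
```

Both printed shapes are identical, since neither step changes the channel count or the spatial size. That is the "uniform size and resolution" property in miniature: blocks of this kind can be stacked without any downsampling or reshaping between them. In a tensor library the same two steps would typically be a grouped convolution with one group per channel, followed by a convolution with kernel size 1.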