View-Semantic Transformer With Enhancing Diversity for Sparse-View SAR Target Recognition

Published: 2023, Last Modified: 21 Mar 2026 · IEEE Trans. Geosci. Remote Sens. 2023 · CC BY-SA 4.0
Abstract: With the rapid development of supervised learning-based synthetic aperture radar (SAR) target recognition, recognition performance has become closely tied to the number of training samples. However, the biased data distribution and the under-representation of the model caused by incomplete within-category data exacerbate the challenge of SAR interpretation. In this article, we propose a new view-semantic transformer network (VSTNet) that generates synthesized samples to complete the statistical distribution of the training data and improve the discriminative representation of the model. First, SAR images from different views are encoded into a disentangled latent space, which allows data with more diverse views to be synthesized by manipulating view-semantic features. Second, the synthesized data serve as a complement that effectively expands the training set and alleviates overfitting under limited data in sparse views. Third, the proposed method unifies SAR image synthesis and SAR target recognition in an end-to-end framework so that the two tasks mutually boost each other's performance. Experiments on the moving and stationary target acquisition and recognition (MSTAR) dataset demonstrate the robustness and effectiveness of the proposed method.
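The core idea of the abstract, disentangling each latent code into a view part and a semantic (identity) part and then manipulating only the view part to synthesize new training views, can be illustrated with a minimal toy sketch. This is not the paper's actual VSTNet architecture; the dimensions, the simple vector split, and the linear view interpolation below are illustrative assumptions.

```python
import numpy as np

# Hypothetical layout: the first VIEW_DIM entries of a latent code hold
# view information, the rest hold target-identity (semantic) information.
VIEW_DIM, SEM_DIM = 2, 4

def split_latent(z):
    """Disentangle a latent vector into (view, semantic) parts."""
    return z[:VIEW_DIM], z[VIEW_DIM:]

def synthesize(z_a, z_b, alpha):
    """Keep z_a's semantic code; interpolate between the two view codes.

    This mimics generating the same target from a new, unseen view by
    manipulating only the view-semantic features.
    """
    view_a, sem_a = split_latent(z_a)
    view_b, _ = split_latent(z_b)
    new_view = (1 - alpha) * view_a + alpha * view_b
    return np.concatenate([new_view, sem_a])

# Two latent codes of the same target observed from two sparse views:
# identical semantic halves, different view halves.
z1 = np.array([0.0, 1.0, 0.5, 0.5, 0.5, 0.5])
z2 = np.array([1.0, 0.0, 0.5, 0.5, 0.5, 0.5])

# Densify the sparse view coverage with intermediate synthetic views,
# which would then be decoded into images to expand the training set.
augmented = [synthesize(z1, z2, a) for a in (0.25, 0.5, 0.75)]
```

In the full method, such synthesized codes would be decoded back into SAR images and fed to the recognition branch, with synthesis and recognition trained jointly end to end.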