TCC-SemCom: A Transformer-CNN Complementary Block-Based Image Semantic Communication

Published: 01 Jan 2025, Last Modified: 03 Apr 2025. IEEE Commun. Lett. 2025. License: CC BY-SA 4.0
Abstract: Semantic communication (SemCom), as a paradigm beyond bit-level communication, is regarded as an effective solution to the challenges posed by the growing volume of vision-based traffic. Existing semantic image communication methods are mostly based on convolutional neural networks (CNNs) or Transformers, which focus on different structural semantics. Specifically, CNNs with local convolution operations excel at capturing local semantic features, while Transformers, based on the multi-head attention mechanism, are better at modeling long-range dependencies and global semantic information. To effectively fuse these two models and leverage both of their advantages, we propose a parallel Transformer-CNN complementary (TCC) block, in which CNNs and Transformers are combined to enhance the extraction of both local and global semantic information. Furthermore, we propose a TCC-based SemCom (TCC-SemCom) scheme for wireless image transmission. Experimental results verify that TCC-SemCom significantly outperforms existing schemes in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity index (MS-SSIM).
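The parallel two-branch design described in the abstract could be sketched roughly as follows. This is a minimal illustrative sketch in PyTorch, not the paper's actual architecture: the layer widths, the residual connection, and the fusion-by-concatenation step are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class TCCBlockSketch(nn.Module):
    """Hypothetical parallel Transformer-CNN complementary block.

    A CNN branch captures local semantics, a multi-head self-attention
    branch captures global semantics, and the two are fused. All design
    details here are illustrative assumptions.
    """

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # CNN branch: local semantic features via 3x3 convolutions
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Transformer branch: global dependencies via multi-head attention
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion: concatenate both branches, then mix with a 1x1 convolution
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local_feat = self.cnn_branch(x)                      # (B, C, H, W)
        tokens = x.flatten(2).transpose(1, 2)                # (B, H*W, C)
        tokens = self.norm(tokens)
        global_feat, _ = self.attn(tokens, tokens, tokens)   # (B, H*W, C)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        # Residual connection around the fused branches (assumed)
        return x + self.fuse(torch.cat([local_feat, global_feat], dim=1))

x = torch.randn(2, 32, 16, 16)
y = TCCBlockSketch(32)(x)
```

In a full SemCom encoder such a block would be stacked with down-sampling stages before channel coding; the output here keeps the input shape so blocks can be chained.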