Bridging Semantic Scale Gaps in Image Transmission Through Multi-Scale Joint Perception and Generation
Abstract: Semantic communication, leveraging deep network-based Joint Source-Channel Coding (JSCC), has garnered increasing attention in recent years. However, existing methods are primarily suited to transmitting single-scale semantics such as pixels, rather than adaptively fusing multi-scale semantics such as objects and scenes. Owing to substantial variations in data volume across semantic scales, selecting the appropriate semantic scale for transmission according to the varying Channel State Information (CSI) can significantly improve the efficiency of conveying semantic information. This letter introduces a cross-scale Generative Semantic Communication (GSC) method for image transmission, named BriGSC. Under CSI constraints, our method jointly perceives textual and visual features to represent semantics at different scales, and achieves rate-adaptive encoding, transmission, decoding, and image generation. Experimental results show that, by jointly encoding multi-scale semantic features, our method attains stronger noise resistance and higher coding efficiency than semantic communication methods based on deep learning (SwinJSCC) and on generative models (SGD-JSCC). Across various channel conditions, the average FID and LPIPS values are 35% and 25% lower, respectively, than those of SwinJSCC and SGD-JSCC. The code is available at https://github.com/AsanoSaki/BriGSC.
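The abstract's core idea, that coarser semantic scales carry far less data and so remain viable on noisier channels, can be illustrated with a minimal sketch. The scale names, payload sizes, and SNR thresholds below are illustrative assumptions for exposition; the paper's actual method uses learned joint perception and generation, not a fixed lookup.

```python
# Hypothetical sketch of CSI-driven semantic-scale selection: lower channel
# quality favors coarser (lower data volume) semantic scales. All thresholds
# and payload sizes here are illustrative assumptions, not from the paper.

SCALES = [
    # (name, approx. payload in symbols, minimum SNR in dB to be worthwhile)
    ("scene-text", 128, -5.0),       # e.g. a caption describing the scene
    ("object-features", 1024, 5.0),  # object-level visual features
    ("pixel-features", 8192, 15.0),  # dense pixel-level features
]

def select_scale(snr_db: float) -> str:
    """Pick the finest semantic scale whose SNR requirement is met."""
    chosen = SCALES[0][0]  # coarsest scale is the fallback
    for name, _payload, min_snr in SCALES:
        if snr_db >= min_snr:
            chosen = name
    return chosen

print(select_scale(-10.0))  # very noisy channel -> "scene-text"
print(select_scale(20.0))   # clean channel -> "pixel-features"
```

In this toy version the rate adaptation is a hard threshold; BriGSC instead fuses scales jointly under the CSI constraint, so the selection is soft and learned rather than rule-based.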
External IDs: dblp:journals/wcl/GaoYYLLX25