Edge Large AI Model Agent-Empowered Cognitive Multimodal Semantic Communication

Yan Sun, Yinqiu Liu, Shaoyong Guo, Xuesong Qiu, Jiewei Chen, Jiakai Hao, Dusit Niyato

Published: 01 Jan 2026 · Last Modified: 13 Mar 2026 · IEEE Transactions on Mobile Computing · License: CC BY-SA 4.0
Abstract: Semantic communication (SemCom) enables efficient transmission for mobile edge computing (MEC) services by extracting critical semantics from raw information. Although widely adopted across scenarios, existing single-modal SemCom systems struggle to support edge multimodal data transmission efficiently. Moreover, mobile end users have varying communication requirements across modalities, yet existing work cannot generate personalized communication policies tailored to diverse intents (communication policies typically cover bandwidth allocation, modulation and coding schemes, etc.). In this paper, we propose an edge Cognitive SemCom Agent (CSCA) to facilitate edge multimodal SemCom. Specifically, CSCA leverages an edge Large AI Model (LAM) to realize modality alignment and natural-language intent understanding. Moreover, we develop a communication planning module that generates personalized wireless communication policies based on the LAM's environment and intent cognition. In particular, to assess the efficiency of communication policies in multimodal SemCom and capture intent competition, we present a novel indicator, the cognitive SemCom quality indicator (CSCQI). We then use a denoising diffusion probabilistic model to optimize policy generation. Extensive experimental results demonstrate that CSCA improves the intent satisfaction rate and semantic accuracy by an average of 42.19% and 29.75%, respectively, while reducing communication delay by 33.40%.
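The abstract mentions using a denoising diffusion probabilistic model (DDPM) to generate communication policies. As an illustration only, the following is a minimal sketch of how DDPM-style reverse diffusion could sample a policy vector (e.g., per-modality bandwidth shares): the schedule, the stub noise predictor `dummy_eps`, and all function names here are assumptions for illustration, not the paper's actual model or implementation.

```python
# Hypothetical sketch of DDPM-based policy generation: a policy vector
# (e.g., per-modality bandwidth shares) is sampled by iteratively denoising
# Gaussian noise. The "noise model" is a stub standing in for a learned
# network; every name below is illustrative, not from the paper.
import numpy as np

def make_schedule(T=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule: betas, alphas, and cumulative alpha products."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def denoise_policy(eps_model, dim=4, T=50, seed=0):
    """Reverse diffusion: start from pure noise, apply T denoising steps."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(T)
    x = rng.standard_normal(dim)                       # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = eps_model(x, t)                          # predicted noise
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])   # posterior mean
        noise = rng.standard_normal(dim) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise           # sample x_{t-1}
    # Map the denoised vector to a valid allocation via softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

# Stand-in for a trained noise-prediction network.
dummy_eps = lambda x, t: 0.1 * x
policy = denoise_policy(dummy_eps)
print(policy)  # per-modality shares summing to 1
```

In a real system the stub would be replaced by the trained conditional noise predictor, and the final projection would encode whatever constraints the policy space imposes (total bandwidth, discrete modulation and coding scheme choices, etc.).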