FLSC-CI: Federated Learning and Semantic Communication Empowered Multimodal Terminal Collaborative Inferencing Framework for IoT Businesses

Siya Xu, Yonghao Qi, Feng Qi, Shaoyong Guo, Ziyu Zhao

Published: 01 Jan 2026, Last Modified: 13 Mar 2026, Venue: IEEE Transactions on Network Science and Engineering, License: CC BY-SA 4.0

Abstract: Inference tasks based on multimodal data from the Internet of Things (IoT) play an important role in intelligent management. Because IoT devices have limited resources, existing edge frameworks struggle to deliver accurate inference with low energy consumption and high efficiency. This paper introduces a Federated Learning (FL) and semantic communication empowered multimodal terminal collaborative inferencing framework for IoT businesses (FLSC-CI). First, we propose an FL-based Customized model Training Algorithm (FL-CTA) for semantic encoder-decoder models and business inference models. In the semantic extraction phase, high-quality terminals perform local model training, model aggregation, and semantic extraction, while low-quality terminals perform only local model training or semantic extraction. In the business inference model training phase, the edge server synchronously trains the multimodal model using the semantics transmitted from terminals. Furthermore, this paper proposes a Heterogeneous Resource Dynamic Allocation Strategy (HRDAS) for FLSC-CI, based on multi-agent deep deterministic policy gradient, to manage the FL training process. Intelligent agents at cluster heads make customized allocation decisions for system bandwidth and power according to the service capabilities and model features of the terminals within each cluster. Simulation results demonstrate that FLSC-CI significantly improves resource utilization and communication efficiency while maintaining high inference accuracy, making it suitable for large-scale heterogeneous IoT deployments.
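The division of labor between high-quality and low-quality terminals described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' FL-CTA. The `Terminal` class, the `quality` score, both thresholds, and the rule that splits low-quality terminals between training and extraction are hypothetical, and the sample-weighted (FedAvg-style) average is an assumed aggregation rule that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Terminal:
    name: str
    quality: float        # hypothetical service-capability score in [0, 1]
    num_samples: int      # size of the terminal's local dataset
    weights: List[float]  # locally trained model parameters (flattened)


def assign_roles(t: Terminal, high: float = 0.5, train_min: float = 0.2) -> Set[str]:
    """Map a terminal to its tasks in the semantic extraction phase.

    Both thresholds are assumptions made for illustration only.
    """
    if t.quality >= high:
        # High-quality terminals take on all three tasks.
        return {"train", "aggregate", "extract"}
    # Low-quality terminals perform only one of the two lighter tasks.
    return {"train"} if t.quality >= train_min else {"extract"}


def fedavg(trainers: List[Terminal]) -> List[float]:
    """Sample-weighted average of local parameters (assumed aggregation rule)."""
    total = sum(t.num_samples for t in trainers)
    dim = len(trainers[0].weights)
    return [
        sum(t.weights[i] * t.num_samples for t in trainers) / total
        for i in range(dim)
    ]


terminals = [
    Terminal("a", quality=0.9, num_samples=30, weights=[1.0, 2.0]),
    Terminal("b", quality=0.3, num_samples=10, weights=[3.0, 6.0]),
    Terminal("c", quality=0.1, num_samples=5, weights=[0.0, 0.0]),
]
roles = {t.name: assign_roles(t) for t in terminals}
# Only terminals that trained locally contribute to the aggregate.
trainers = [t for t in terminals if "train" in roles[t.name]]
global_weights = fedavg(trainers)
```

In this sketch terminal `c` only extracts semantics, so its parameters are excluded from aggregation, and a high-quality terminal such as `a` would be the one to run `fedavg`.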

External IDs: doi:10.1109/tnse.2026.3667621