Cooperative Partial Task Offloading and Resource Allocation for IIoT Based on Decentralized Multiagent Deep Reinforcement Learning

Fan Zhang, Guangjie Han, Li Liu, Yu Zhang, Yan Peng, Chao Li

Published: 01 Jan 2024, Last Modified: 13 Nov 2024. IEEE Internet Things J. 2024. License: CC BY-SA 4.0

Abstract: Edge computing has become increasingly important for meeting the diverse Quality-of-Service (QoS) and Quality-of-Experience (QoE) demands of Industrial Internet of Things (IIoT) applications, such as machine condition monitoring, fault diagnosis, intelligent production scheduling, and production quality control. Because IIoT systems are heterogeneous, it is urgent to address the cloud–edge–end cooperative partial task offloading and resource allocation (CPTORA) problem in order to achieve workload balancing, efficient resource utilization, and better QoS/QoE for IIoT applications. The challenge lies in making real-time, accurate, decentralized task offloading (TO) and resource allocation (RA) decisions for dynamic and device-intensive IIoT. This work therefore examines the CPTORA problem for IIoT, aiming to minimize the long-run overall delay and energy costs. To lower its complexity, the problem is decomposed into a TO subproblem and an RA subproblem. An improved soft actor–critic-based decentralized multiagent deep reinforcement learning (MADRL) algorithm is then proposed to address the TO subproblem, in which each IIoT device can learn its globally optimal policy and make its decisions independently. This algorithm combines divergence regularization, distributional reinforcement learning, and value function decomposition to improve on the convergence speed and accuracy of existing MADRL methods. After receiving the TO decisions of the IIoT devices, each edge server applies the Lagrange multiplier method and the Karush–Kuhn–Tucker (KKT) conditions to solve its RA subproblem. The experimental results show that the proposed algorithm reduces the overall delay and energy costs more effectively than other state-of-the-art MADRL approaches.
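
The abstract does not state the RA formulation, so the following is only an illustrative sketch of how a Lagrange multiplier and Karush-Kuhn-Tucker (KKT) argument can yield a closed-form allocation. It assumes a simplified, hypothetical model that is not taken from the paper: one edge server with CPU capacity `capacity` (cycles per second) serves offloaded tasks needing `cycles[i]` CPU cycles each, and minimizes the summed computation delay, sum of `cycles[i] / f[i]`, subject to the allocations `f[i]` summing to at most `capacity`. Setting the derivative of the Lagrangian to zero and using the fact that the capacity constraint is tight gives `f[i]` proportional to the square root of `cycles[i]`. The function names are invented for this example.

```python
import math


def allocate_cpu(cycles, capacity):
    """Closed-form KKT allocation for the assumed toy problem:
    minimise sum(cycles[i] / f[i]) subject to sum(f) <= capacity, f > 0.

    Stationarity gives -cycles[i] / f[i]**2 + lam = 0, so f[i] = sqrt(cycles[i] / lam).
    The constraint is active at the optimum, which fixes lam and yields
    f[i] = capacity * sqrt(cycles[i]) / sum_j sqrt(cycles[j]).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if any(c < 0 for c in cycles):
        raise ValueError("cycle counts must be non-negative")
    roots = [math.sqrt(c) for c in cycles]
    total = sum(roots)
    if total == 0:
        # No work to do: nothing needs to be allocated.
        return [0.0 for _ in cycles]
    return [capacity * r / total for r in roots]


def total_delay(cycles, alloc):
    """Summed computation delay (seconds) for a given allocation."""
    return sum(c / f for c, f in zip(cycles, alloc) if c > 0)


if __name__ == "__main__":
    # Hypothetical workload: two tasks on a 3 GHz edge server.
    cycles = [1e9, 4e9]
    capacity = 3e9
    kkt = allocate_cpu(cycles, capacity)
    equal = [capacity / len(cycles)] * len(cycles)
    print("KKT allocation:", kkt, "delay:", total_delay(cycles, kkt))
    print("Equal split:   ", equal, "delay:", total_delay(cycles, equal))
```

Under this assumed model, the square-root rule gives larger tasks more capacity than an equal split but less than a proportional one. The RA subproblem in the paper also involves energy costs and may include further constraints, which this sketch leaves out.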