DPU-Enhanced Multi-Agent Actor-Critic Algorithm for Cross-Domain Resource Scheduling in Computing Power Network

Shuaichao Wang; Shaoyong Guo; Jiakai Hao; Yinlin Ren; Feng Qi

Published: 01 Jan 2024, Last Modified: 14 May 2025. Venue: IEEE Trans. Netw. Serv. Manag. 2024. License: CC BY-SA 4.0

Abstract: Computing resources in the Computing Power Network (CPN) are unevenly distributed, which creates an imbalance between resource supply and demand within domains and necessitates cross-domain resource scheduling. To address the cross-domain resource scheduling challenge in CPN, this paper presents an Improved Multi-Agent Actor-Critic (IMAAC) resource scheduling approach based on Data Processing Unit (DPU) offloading. First, we introduce a cross-domain resource scheduling architecture tailored for CPN that uses DPU offloading. Specifically, we delegate certain functionalities of the Multi-Agent Deep Reinforcement Learning (MADRL) agent to DPUs, aiming to reduce the communication costs incurred while generating cross-domain scheduling decisions. Second, we introduce a parallel experience ensemble and a multi-head attention mechanism into the Multi-Agent Actor-Critic (MAAC) framework to compress the state-space dimensionality of agent association across domains. Finally, we introduce a parallelized dual-policy network structure to mitigate training instability and convergence challenges in the actor and critic networks. Experimental results show that, compared with the benchmarks, IMAAC reduces total system delay, energy consumption, and the number of discarded tasks by 5.98% to 13.56%, 23.54% to 33.55%, and 41.17% to 58.88%, respectively.
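The abstract does not give the IMAAC equations, so the following is only a minimal sketch of the general idea behind attention in a multi-agent critic: each agent pools the encodings of the other agents with multi-head scaled dot-product attention, so the critic input has a fixed size regardless of how many agents (domains) exist. All names, dimensions, and the random weights here (`attend`, `embed_dim`, `head_dim`, `heads`) are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Hypothetical untrained projection weights, used for illustration only."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def attend(embeddings, agent, heads):
    """Pool the other agents' embeddings for `agent` with multi-head attention.

    Returns the concatenated per-head summaries and the per-head weights.
    """
    others = [j for j in range(len(embeddings)) if j != agent]
    summary, all_weights = [], []
    for w_q, w_k, w_v in heads:
        query = matvec(w_q, embeddings[agent])
        keys = [matvec(w_k, embeddings[j]) for j in others]
        values = [matvec(w_v, embeddings[j]) for j in others]
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        pooled = [
            sum(w * value[d] for w, value in zip(weights, values))
            for d in range(len(values[0]))
        ]
        summary.extend(pooled)
        all_weights.append(weights)
    return summary, all_weights


rng = random.Random(0)
n_agents, embed_dim, head_dim, n_heads = 4, 8, 4, 2
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(embed_dim)] for _ in range(n_agents)]
heads = [
    tuple(random_matrix(head_dim, embed_dim, rng) for _ in range(3))
    for _ in range(n_heads)
]
summary, weights = attend(embeddings, agent=0, heads=heads)
```

In this sketch the pooled summary has length `n_heads * head_dim` whether there are 4 agents or 40, whereas concatenating every other agent's encoding would grow linearly with the number of agents. That fixed-size input is one plausible reading of the "compress the state-space dimensionality" claim.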
