Optimizing Task Offloading in Dynamic Satellite-Terrestrial Integrated Computing Power Networks: A Time-Space-Aware DRL Approach

Xin Zou, Renchao Xie, Zehui Xiong, Gaochang Xie, Qinqin Tang, Tao Huang, Chau Yuen, Zhu Han

Published: 2026, Last Modified: 27 Jan 2026. IEEE Trans. Cogn. Commun. Netw. 2026. CC BY-SA 4.0

Abstract: Driven by the increasing demand from emerging applications for wide coverage, low latency, and powerful computing capabilities in networks, the Satellite–Terrestrial Integrated Computing Power Network (ST-CPN) has emerged as a promising solution. However, the dynamic nature of the ST-CPN and the limited resources of individual nodes present significant challenges to the efficient execution of complex applications with multiple interdependent subtasks. This paper investigates the dependent task offloading problem in dynamic ST-CPNs. To tackle the challenges associated with satellite mobility, uneven service distribution, and spatial uncertainty, we propose a Time–Space-Varying Resource Model (TSVRM) to capture the dynamic variations of communication, computation, and storage resources. Spatially, TSVRM predicts link establishment and switching based on satellite trajectories, visibility constraints, and polar region recognition, thereby modeling topology evolution driven by orbital motion. Temporally, a Markov process is used to represent the stochastic evolution of resource states. Building on TSVRM, we develop a Time–Space-Aware Deep Reinforcement Learning (TS-DRL) offloading scheme to determine subtask execution placement. It employs an upward-ranking mechanism for subtask prioritization and a Long Short-Term Memory (LSTM)-based Sequence-to-Sequence (S2S) network to encode structured task features. The network is trained via Proximal Policy Optimization (PPO) to approximate the policy and value functions of the Markov Decision Process, enabling optimized offloading. Simulation results show that our scheme converges effectively and achieves superior QoS compared to baselines, reaching between 94.10% and 98.27% of the optimal performance.
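The abstract does not spell out the upward-ranking mechanism used for subtask prioritization. As a minimal sketch, assuming it follows the standard upward rank from list scheduling (HEFT-style), the rank of a subtask is its average computation cost plus the largest "transfer cost + rank" over its successors, and subtasks are scheduled in decreasing rank. The DAG, the cost values, and the names `successors`, `avg_compute`, and `upward_ranks` below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from functools import lru_cache

# Hypothetical 4-subtask DAG (not from the paper):
# subtask -> list of (successor, average transfer cost on that edge)
successors = {
    "t1": [("t2", 2.0), ("t3", 1.0)],
    "t2": [("t4", 3.0)],
    "t3": [("t4", 1.0)],
    "t4": [],
}
# Hypothetical average computation cost of each subtask across nodes
avg_compute = {"t1": 4.0, "t2": 3.0, "t3": 6.0, "t4": 2.0}


def upward_ranks(successors, avg_compute):
    """Return the upward rank of every subtask in the DAG.

    rank(i) = avg_compute[i] + max over successors j of (transfer(i, j) + rank(j)),
    with the max taken as 0 for exit subtasks (assumed standard definition).
    """

    @lru_cache(maxsize=None)
    def rank(task):
        tail = max((cost + rank(succ) for succ, cost in successors[task]), default=0.0)
        return avg_compute[task] + tail

    return {task: rank(task) for task in successors}


ranks = upward_ranks(successors, avg_compute)
# Higher rank means the subtask sits on a longer remaining path, so it goes first.
order = sorted(ranks, key=ranks.get, reverse=True)
print(ranks)  # {'t1': 14.0, 't2': 8.0, 't3': 9.0, 't4': 2.0}
print(order)  # ['t1', 't3', 't2', 't4']
```

Sorting by decreasing upward rank always yields a valid topological order when costs are positive, which is why such a ranking is a natural way to linearize dependent subtasks before feeding them to a sequence model.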

External IDs: dblp:journals/tccn/ZouXXXTHYH26