Dynamically Optimize MTD Strategy in Satellite Computing Systems Using A2C Reinforcement Learning

Published: 2025, Last Modified: 12 Jan 2026, ICASSP 2025, CC BY-SA 4.0
Abstract: The Satellite Computing System (SCS) faces an increasing number of attacks. Although Moving Target Defense (MTD) can effectively mitigate attacks in ground networks, it is not well-suited for SCS due to the highly dynamic nature of both SCS traffic and attackers’ scanning behaviors. In this paper, we propose a dynamic MTD strategy optimization scheme using Advantage Actor-Critic (A2C) reinforcement learning. Specifically, we formulate the MTD strategy optimization for SCS as a Markov Decision Process (MDP). Furthermore, to account for uncertainty in how attack behavior changes, we apply A2C reinforcement learning to optimize the MTD strategy within the MDP framework. Experimental results demonstrate that our scheme effectively reduces the frequency of scanning hits, shortens the time for which attackers can hold addresses, and minimizes the impact of MTD on quality of service.
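To make the A2C step concrete, here is a minimal sketch of the generic advantage computation that A2C applies to one trajectory of an MDP. It is not the paper's implementation: the abstract does not give the state, action, or reward definitions, so the function names, the discount factor, and the reward values (imagined here as a penalty for a scanning hit or for QoS disruption) are illustrative assumptions.

```python
def discounted_returns(rewards, gamma, bootstrap_value=0.0):
    """Compute R_t = r_t + gamma * R_{t+1}, working backwards.

    bootstrap_value is the critic's estimate for the state reached
    after the last step (0.0 if the episode terminated).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_losses(log_probs, values, rewards, gamma=0.99, bootstrap_value=0.0):
    """Return (actor_loss, critic_loss, advantages) for one trajectory.

    log_probs[t]: log pi(a_t | s_t) of the action actually taken
    values[t]:    critic estimate V(s_t)
    rewards[t]:   hypothetical reward, e.g. negative for a scanning hit
    """
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    # Advantage: how much better the outcome was than the critic expected.
    advantages = [ret - val for ret, val in zip(returns, values)]
    n = len(rewards)
    # Actor: raise the log-probability of actions with positive advantage.
    # In a real implementation the advantage is treated as a constant here.
    actor_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Critic: mean squared error between V(s_t) and the observed return.
    critic_loss = sum(adv * adv for adv in advantages) / n
    return actor_loss, critic_loss, advantages
```

In a full training loop these two losses would be minimized with an automatic-differentiation framework, with the policy choosing the next MTD action (for example, when to change addresses) from the current network state; that mapping is an assumption about how such a scheme could be wired up, not a description of the authors' design.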