Abstract: The dynamic nature of tactical edge networks has led to the design of architectures that enable real-time data processing and analytics at the edge, ensuring the continuation of operations when the connection to headquarters is unavailable. However, workflow orchestration faces unique challenges over frequently disconnected, intermittent, and limited (DIL) networks, where traditional approaches, developed mainly for cloud-like environments, lack the flexibility to react promptly to ever-changing conditions. This paper presents a novel decentralized partially observable Markov decision process (DEC-POMDP) formulation of the distributed workflow orchestration problem, in which agents must cooperate to maximize computation efficiency while minimizing data transmission time. We propose a solution based on multi-agent reinforcement learning (MARL) that leverages graph convolutional reinforcement learning (DGN) and graph attention networks (GAT) to enable agents to share information, capture the network's structural properties, ensure scalability, and eliminate the need for global knowledge of the network. Training and experiments, which compare our solution against a corresponding constraint satisfaction problem (CSP) formulation, are conducted in a simulated 2D urban scenario that mimics node mobility and communications, showing promising results.