Abstract: Optimizing large-scale logistics is computationally challenging because of its scale and the need to remain robust to stochastic, time-varying weather disturbances. However, prior research in multi-agent reinforcement learning (MARL) does not address scenarios that capture the complexity of logistics operations influenced by dynamic weather patterns. To address this gap, we introduce a new MARL environment, $\textsc{Marine}$, which has two types of agents equipped with limited resources and integrates real wave data to model the influence of weather on the replenishment at sea (RAS) operation. We further propose SchedHGNN, a novel MARL algorithm that incorporates a heterogeneous graph neural network and an intrinsic reward scheme to enhance agent coordination and mitigate challenges induced by environment non-stationarity. Our results show that the combination of effective RAS scheduling and improved communication enables our model to outperform competitive baselines by up to 37.8%. This achievement marks a significant advancement in applying MARL to complex, real-world logistics scenarios.