Risk-Sensitive Multi-Agent Reinforcement Learning over Adaptive Networks
- Keywords: multi-agent reinforcement learning, risk assessment, adaptive networks
- Abstract: Despite recent advances in multi-agent reinforcement learning (MARL), in which each agent seeks to maximize its own utility in collaborative or competitive environments, trained agents may still suffer from several problems: sensitivity to environmental uncertainty, a tendency to become stuck in poor local minima, and vulnerability to link failures and channel noise. To tackle these problems, a risk-aware multi-agent reinforcement learning algorithm is proposed with the following contributions: (1) we introduce a conditional value-at-risk (CVaR) extension of the popular multi-agent deep Q-network (MADQN) and multi-agent deep deterministic policy gradient (MADDPG) algorithms for robust policy estimation and learning; (2) we introduce online distributed noise-adaptive estimation methods, namely the adapt-then-combine (ATC) method over diffusion-based adaptive networks and the alternating direction method of multipliers (ADMM) over consensus-based adaptive networks, to improve robustness to outliers, which are modeled by a Student's t-distribution; (3) we analyze the MARL setting with imperfect communication to improve robustness to link failures and channel noise. The proposed algorithm is applied to autonomous driving and other scenarios, and simulation results demonstrate its efficiency.
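To make the risk measure concrete, the sketch below shows a minimal empirical CVaR estimator of the kind a risk-aware agent would optimize in place of the mean return. It is an illustrative sketch only, not the paper's implementation; the function name `cvar` and the sample data are assumptions for demonstration.

```python
import numpy as np

def cvar(returns, alpha=0.05):
    """Empirical conditional value-at-risk (CVaR) of a sample of returns.

    CVaR at level alpha is the expected value of the worst alpha-fraction
    of outcomes; a risk-sensitive agent maximizes this tail expectation
    rather than the plain mean, which penalizes high-variance policies.
    This is an illustrative sketch, not the paper's implementation.
    """
    returns = np.asarray(returns, dtype=float)
    var = np.quantile(returns, alpha)       # value-at-risk threshold
    tail = returns[returns <= var]          # worst alpha-fraction of outcomes
    return tail.mean()

# Toy usage: for a noisy return distribution, CVaR sits well below the mean,
# reflecting the pessimistic (risk-averse) evaluation of the policy.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=2.0, size=10_000)
risk = cvar(samples, alpha=0.05)
```

In the proposed CVaR extensions of MADQN and MADDPG, this tail expectation replaces the expected return inside the critic's learning target, so the learned values reflect worst-case rather than average performance.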