Policy distillation for efficient decentralized execution in multi-agent reinforcement learning

Yuhang Pei, Tao Ren, Yuxiang Zhang, Zhipeng Sun, Matys Champeyrol

Published: 2025, Last Modified: 09 Nov 2025. Neurocomputing 2025. License: CC BY-SA 4.0

Abstract: Highlights
• Centralized training uses a dual-attention network with global state information.
• Dual-attention differentiates agent policies, preventing homogeneous behavior.
• Policy distillation creates lightweight agents for efficient decentralized execution.
• Proposed network achieves superior training and efficient execution performance.
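The highlights do not state which distillation objective is used. As a rough illustration of the general idea only, the sketch below assumes a common choice: the lightweight student policy is trained to minimize the KL divergence from the teacher's action distribution to its own, computed from action logits. The function names, the temperature parameter, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of action logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) over one agent's discrete action distribution.

    Assumption: this is a generic policy-distillation objective, not
    necessarily the loss used by the authors.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (centralized) policy
    log_q = log_softmax(student_logits, temperature)  # student (lightweight) policy
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical logits for a single agent with three discrete actions.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's action distribution exactly and positive otherwise, so minimizing it over states visited during training pulls the student's local policy toward the teacher's behavior.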