Abstract: Centralized training with decentralized execution (CTDE) is a widely used learning paradigm that has achieved significant success in complex tasks. Drawing inspiration from human team cooperative learning, we propose a novel paradigm that facilitates a gradual shift from explicit communication to tacit cooperation. In the initial training stage, we promote cooperation by sharing relevant information among agents and concurrently reconstructing this information using each agent's local trajectory in a self-supervised way. We then combine the explicitly communicated information with the reconstructed information to obtain mixed information. Throughout the training process, we progressively decrease the proportion of explicitly communicated information, facilitating a seamless transition to fully decentralized execution without communication.
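The gradual shift from communicated to reconstructed information described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the mixing rule, the decay schedule, or the reconstruction objective, so the sketch assumes a convex combination, a linear decay, and a mean-squared error. The names `comm_ratio`, `mix_information`, and `reconstruction_loss` are hypothetical.

```python
from typing import List


def comm_ratio(step: int, total_steps: int) -> float:
    """Share of explicitly communicated information at a training step.

    Assumed schedule: linear decay from 1.0 (full communication)
    to 0.0 (no communication). The source does not state the schedule.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - progress


def mix_information(
    communicated: List[float], reconstructed: List[float], ratio: float
) -> List[float]:
    """Blend communicated and locally reconstructed information.

    Assumed mixing rule: element-wise convex combination.
    """
    if len(communicated) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return [ratio * c + (1.0 - ratio) * r for c, r in zip(communicated, reconstructed)]


def reconstruction_loss(communicated: List[float], reconstructed: List[float]) -> float:
    """Self-supervised target: mean squared error between the two vectors (assumed)."""
    if len(communicated) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((c - r) ** 2 for c, r in zip(communicated, reconstructed)) / len(communicated)


if __name__ == "__main__":
    shared = [1.0, 0.0]   # information received from other agents
    rebuilt = [0.0, 1.0]  # the agent's own reconstruction from its local trajectory
    for step in (0, 50, 100):
        ratio = comm_ratio(step, total_steps=100)
        print(step, ratio, mix_information(shared, rebuilt, ratio))
```

Under these assumptions, the mixed information equals the communicated information at the start of training and the reconstructed information at the end, so the policy's input at execution time depends only on the agent's local trajectory.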