Abstract: Highlights
• Propose C2E-MARL to improve sample efficiency in multi-agent reinforcement learning.
• Learn from data generated by contrastive learning to reduce the demand for samples.
• Use an ensemble Q-network to provide better-generalized Q-estimates for efficient training.
• Achieve superior sample efficiency and performance in multi-agent scenarios.