
# Unsupervised Multi-Agent Diversity With Wasserstein Distance

> In cooperative Multi-Agent Reinforcement Learning (MARL), agents that share policy network parameters are observed to learn similar behaviors, which impedes efficient exploration and can easily trap the cooperative policies in a local optimum. To encourage multi-agent diversity, many recent methods distinguish the trajectories of different agents by maximizing a mutual information objective conditioned on agent identity. Despite their successes, these mutual information-based methods do not necessarily promote exploration. To encourage both multi-agent diversity and sufficient exploration, we propose Wasserstein Multi-Agent Diversity (WMAD), a novel exploration method that maximizes the Wasserstein distance between the trajectory distributions of different agents in a latent representation space. Since the Wasserstein distance is defined over two distributions, we further extend it to learn multiple policies.

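
To give some intuition for the quantity being maximized, the snippet below is a minimal sketch of the 1-Wasserstein distance between two one-dimensional empirical samples. It is not the WMAD implementation: the function name `wasserstein_1d` and the toy "embeddings" are hypothetical, and the method in the paper works with learned latent representations of trajectories rather than scalar samples. The sketch only illustrates that the distance keeps growing as two distributions move further apart.

```python
import random


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal transport plan pairs the sorted samples.
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


rng = random.Random(0)

# Hypothetical scalar embeddings of trajectories collected by one agent.
agent_a = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Two hypothetical second agents: one behaving similarly, one very differently.
near = [x + 0.5 for x in agent_a]
far = [x + 5.0 for x in agent_a]

print(wasserstein_1d(agent_a, near))  # small distance for similar behavior
print(wasserstein_1d(agent_a, far))   # larger distance for dissimilar behavior
```

Because `near` and `far` are shifted copies of `agent_a`, the distance here is (up to floating-point error) just the size of the shift, so pushing the two samples further apart always increases it.
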
## Requirements

To install requirements:

```setup
pip install -r requirements.txt
```

## Run an experiment

To train the model(s) in the paper, run this command:

```train
python3 src/main.py --config=wmad_smac_parallel --env-config=sc2
```
