
# Directional-based Wasserstein Distance for Efficient Multi-Agent Diversity

> In cooperative Multi-Agent Reinforcement Learning (MARL), agents typically share a single policy network to accelerate training. However, sharing policy parameters often leads agents to behave similarly, which restricts effective exploration and results in suboptimal cooperative policies. To promote diversity among agents, recent works differentiate the trajectories of different agents, conditioned on agent identity, by maximizing a mutual information objective. However, these methods do not necessarily enhance exploration. To promote efficient multi-agent diversity and more robust exploration, we introduce a novel exploration method called Directional Metric-based Diversity (DMD). DMD maximizes an inner-product-based Wasserstein distance between the trajectory distributions of different agents in a latent trajectory representation space, providing a more efficient and structured distance metric. Since directly calculating the Wasserstein distance is intractable, we introduce a kernel method to compute it at low computational cost. Empirical evaluations across a variety of complex multi-agent scenarios show that our method explores more effectively and outperforms current state-of-the-art methods.
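
For intuition about a projection-based Wasserstein distance between two agents' latent trajectory embeddings, here is a minimal sketch. It is not the DMD kernel method from the paper: it swaps in the standard sliced Wasserstein approximation, which projects points onto random unit directions with an inner product and averages the resulting 1-D distances. The function name `sliced_wasserstein`, its parameters, and the toy embeddings are all hypothetical and do not come from this repository.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_directions=64, seed=0):
    """Approximate the 1-Wasserstein distance between two equal-sized point sets.

    Illustrative stand-in only (sliced Wasserstein), not the paper's kernel method.
    xs, ys: lists of equal length holding points (lists of floats) of equal dimension.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-sized sample sets")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_directions):
        # Draw a random direction and normalise it to unit length.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        # Project every point onto the direction via an inner product, then sort.
        px = sorted(sum(a * b for a, b in zip(p, d)) for p in xs)
        py = sorted(sum(a * b for a, b in zip(p, d)) for p in ys)
        # For equal-sized 1-D samples, W1 is the mean gap between sorted values.
        total += sum(abs(a - b) for a, b in zip(px, py)) / len(px)
    return total / n_directions


# Toy latent embeddings for two hypothetical agents (assumed data, 4-D).
_rng = random.Random(1)
agent_a = [[_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(32)]
agent_b = [[v + 2.0 for v in p] for p in agent_a]  # same shape, shifted by 2

same = sliced_wasserstein(agent_a, agent_a)
shifted = sliced_wasserstein(agent_a, agent_b)
```

In this toy setup, identical embeddings give a distance of zero and the shifted embeddings give a positive one. A diversity objective in this spirit would reward agents for increasing such a distance between their trajectory distributions.
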
## Requirements

To install requirements:

```setup
pip install -r requirements.txt
```

## Run an experiment

To train the model(s) in the paper, run this command:

```train
python3 src/main.py --config=dmd_smac_parallel --env-config=sc2
```
