# HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
<p align="center">
  <img src="HyperMARL/assets/hypermarl_black.gif" width="80%" alt="HyperMARL overview">
</p>

If you prefer a static image, here is the [PNG version](HyperMARL/assets/hypermarl.png).

---
## Reproducing the results

- For the Specialisation Game, Synchronisation Game, Dispersion, Navigation and SMAX experiments, follow [these instructions](HyperMARL/docs/reproduce.md).
- For the MAMuJoCo experiments, follow [these instructions](HAPPO/README.md).

---

## TL;DR 📜
**🧩 Problem**  
Independent policies (**NoPS**) scale poorly and are sample-inefficient, while fully shared policies (**FuPS**) are efficient but can collapse to uniform behaviour due to cross-agent interference -- which can be exacerbated when a shared network **couples** observations and agent IDs.

**🛠️ Our approach**  
We propose **HyperMARL** -- a MARL architecture that uses *agent-conditioned hypernetworks* to **decouple** observation- and agent-conditioned gradients and to dynamically generate *agent-specific* actor and critic parameters. This lets agents exhibit diverse or homogeneous behaviours as needed, **without altering the RL learning objective, requiring prior knowledge of the optimal diversity, or requiring sequential updates**.
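
To make the idea concrete, here is a minimal JAX sketch of an agent-conditioned hypernetwork that emits per-agent actor parameters. It is an illustration only, not the repository's implementation: the function names (`init_hypernet`, `generate_actor`, `actor_logits`), the layer sizes, and the use of a single linear actor with learned agent embeddings are all assumptions made for brevity. The point it shows is that the generated weights depend only on the agent ID, while the observation is processed only by those generated weights.

```python
import jax
import jax.numpy as jnp

# Hypothetical sizes, chosen only for illustration.
N_AGENTS, EMBED_DIM, OBS_DIM, N_ACTIONS = 4, 8, 6, 3


def init_hypernet(key):
    """Learnable agent embeddings plus linear heads that emit actor weights and bias."""
    k_emb, k_w, k_b = jax.random.split(key, 3)
    return {
        "embed": 0.1 * jax.random.normal(k_emb, (N_AGENTS, EMBED_DIM)),
        "w_head": 0.1 * jax.random.normal(k_w, (EMBED_DIM, OBS_DIM * N_ACTIONS)),
        "b_head": 0.1 * jax.random.normal(k_b, (EMBED_DIM, N_ACTIONS)),
    }


def generate_actor(hyper, agent_id):
    """Map an agent ID to that agent's actor parameters (the observation is not used here)."""
    e = hyper["embed"][agent_id]
    W = (e @ hyper["w_head"]).reshape(OBS_DIM, N_ACTIONS)
    b = e @ hyper["b_head"]
    return W, b


def actor_logits(hyper, agent_id, obs):
    """Apply the generated, agent-specific linear actor to the observation."""
    W, b = generate_actor(hyper, agent_id)
    return obs @ W + b


hyper = init_hypernet(jax.random.PRNGKey(0))
obs = jnp.ones((N_AGENTS, OBS_DIM))  # every agent receives the same observation
agent_ids = jnp.arange(N_AGENTS)

# One shared hypernetwork, evaluated for all agents in parallel.
logits = jax.vmap(actor_logits, in_axes=(None, 0, 0))(hyper, agent_ids, obs)
print(logits.shape)  # (4, 3)
```

Because the agents differ only through their embeddings, they can produce different logits for identical observations, or converge to similar ones if their embeddings do. A critic could be generated the same way under the same assumptions.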

---

### Key Features
| 🔑 Feature | Description |
|----|---------|
| 🧬 **Agent-conditioned hypernetwork** | A shared hypernetwork generates per‐agent actor and critic parameters on the fly. |
| 🔀 **Gradient decoupling** | Decouples observation- and agent-conditioned gradients, which empirically reduces gradient variance and cross‐agent interference. |
| 📊 **Competitive results** | Matches or is competitive with NoPS, FuPS (+/– ID) and three diversity-promoting baselines **across a wide range of MARL benchmarks** (including Dispersion and Navigation from VMAS, SMAX, MAMuJoCo, and custom environments), diverse task types (homogeneous, heterogeneous, and mixed), and agent counts (2–20). We also show HyperMARL maintains *behavioural diversity* comparable to NoPS. |
| 🔌 **Easy integration** | Integrates into existing on- or off-policy algorithms with minimal code changes: no extra loss terms, no diversity hyperparameters (i.e. no need to know the optimal diversity required for a task), and no sequential updates. We include JAX (main) and PyTorch implementations. |
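
The integration claim in the table can be illustrated with a short JAX sketch in which a hypernetwork-generated actor is trained with an unmodified policy-gradient loss. This is a hedged example, not code from this repository: `hyper_actor`, `pg_loss`, the sizes, the toy batch, and the plain gradient-descent step are all hypothetical stand-ins for whatever algorithm is already in use.

```python
import jax
import jax.numpy as jnp

N_AGENTS, EMBED_DIM, OBS_DIM, N_ACTIONS = 3, 4, 5, 2  # hypothetical sizes


def hyper_actor(params, agent_id, obs):
    """Hypothetical actor: weights are generated from the agent embedding, then applied to obs."""
    e = params["embed"][agent_id]
    W = (e @ params["w_head"]).reshape(OBS_DIM, N_ACTIONS)
    return jax.nn.log_softmax(obs @ W)


def pg_loss(params, agent_ids, obs, actions, advantages):
    """Plain policy-gradient loss: no diversity term and no extra hyperparameters."""
    log_probs = jax.vmap(hyper_actor, in_axes=(None, 0, 0))(params, agent_ids, obs)
    chosen = jnp.take_along_axis(log_probs, actions[:, None], axis=1)[:, 0]
    return -jnp.mean(chosen * advantages)


k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = {
    "embed": 0.1 * jax.random.normal(k1, (N_AGENTS, EMBED_DIM)),
    "w_head": 0.1 * jax.random.normal(k2, (EMBED_DIM, OBS_DIM * N_ACTIONS)),
}

# Toy batch with one transition per agent (made-up values).
agent_ids = jnp.arange(N_AGENTS)
obs = jax.random.normal(k3, (N_AGENTS, OBS_DIM))
actions = jnp.array([0, 1, 0])
advantages = jnp.array([1.0, -0.5, 2.0])

# A single simultaneous update for all agents through the shared hypernetwork.
loss, grads = jax.value_and_grad(pg_loss)(params, agent_ids, obs, actions, advantages)
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
```

In this sketch the only change relative to a fully shared policy is the function that produces the log-probabilities; the loss and the update step are left as they were.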




---
## Citation 📝
```bibtex
@article{hypermarl2025,
  title   = {HyperMARL: Adaptive Hypernetworks for Multi-Agent RL},
  author  = {[Author names]},
  journal = {[Journal name]},
  year    = {2025}
}
```
