# HyperMARL Reproduction Guide

This is a fork of the original [HAPPO repository](https://github.com/PKU-MARL/HARL). We add HyperMARL to this fork and compare it against HAPPO and Kaleidoscope (whose [code](https://github.com/LXXXXR/Kaleidoscope) we copied over) on MaMuJoCo under identical settings.

For on-policy experiments, we use four of the six scenarios from the original paper, including the challenging 17-agent Humanoid task. The additional Walker and Hopper variants were excluded because MAPPO and HAPPO perform similarly in those environments.

For off-policy experiments, we use the same environments as Kaleidoscope: Ant-v2, HalfCheetah-v2, Walker2d-v2 (overlapping with our IPPO experiments), and Swimmer-v2-10x2 (the MaMuJoCo variant with the highest number of agents they tested on).

# Installation

Follow the instructions from the original repo to install the [repo](https://github.com/PKU-MARL/HARL?tab=readme-ov-file#installation) and [MaMuJoCo](https://github.com/PKU-MARL/HARL?tab=readme-ov-file#installation).

If you run into issues installing MuJoCo, please refer to the HAPPO repo. To help, we also include a shell script that installs MuJoCo (run `download_mujoco.sh`) and a file with step-by-step instructions (copy the contents of `mujoco_env_3.9.sh` into your terminal).

Additionally, install wandb, which we use for logging, with `pip install wandb`.
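
As a quick sanity check after installation, the sketch below reports whether `wandb` can be imported by `python3`. The `check_wandb` helper is hypothetical and not part of this repo. `WANDB_MODE=offline` is an environment variable read by the wandb library itself; whether the scripts in this fork override it is an assumption you should verify before relying on it.

```bash
# Sketch: report whether the wandb package is importable by python3.
# check_wandb is a hypothetical helper, not something shipped with this repo.
check_wandb() {
  if python3 -c "import wandb" >/dev/null 2>&1; then
    echo "wandb: installed"
  else
    echo "wandb: missing (run: pip install wandb)"
  fi
}

check_wandb

# Optional: keep runs local instead of syncing them to the wandb servers.
# Assumption: the training scripts do not set the wandb mode explicitly.
export WANDB_MODE=offline
```

To sync runs online instead, leave `WANDB_MODE` unset and authenticate once with `wandb login`.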

# Experiments

## On-Policy - MAPPO as the base algorithm

To run the experiments, use the following commands (one script per algorithm and environment):

```bash
# ant
exps/raw_scripts/mujoco/ant/happo.sh
exps/raw_scripts/mujoco/ant/hypermarl.sh
exps/raw_scripts/mujoco/ant/mappo_ind_actor.sh # mappo with independent actor weights
exps/raw_scripts/mujoco/ant/mappo_fups.sh # mappo with shared actor weights - FUPS

# halfcheetah
exps/raw_scripts/mujoco/halfcheetah/happo.sh
exps/raw_scripts/mujoco/halfcheetah/hypermarl.sh
exps/raw_scripts/mujoco/halfcheetah/mappo_ind_actor.sh 
exps/raw_scripts/mujoco/halfcheetah/mappo_fups.sh 

# walker 
exps/raw_scripts/mujoco/walker/happo.sh
exps/raw_scripts/mujoco/walker/hypermarl.sh
exps/raw_scripts/mujoco/walker/mappo_ind_actor.sh
exps/raw_scripts/mujoco/walker/mappo_fups.sh

# humanoid
exps/raw_scripts/mujoco/humanoid/happo.sh
exps/raw_scripts/mujoco/humanoid/hypermarl.sh
exps/raw_scripts/mujoco/humanoid/mappo_ind_actor.sh
exps/raw_scripts/mujoco/humanoid/mappo_fups.sh
```
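
The on-policy scripts follow a regular `exps/raw_scripts/mujoco/<env>/<algo>.sh` layout, so they can be enumerated in a loop. The sketch below is one possible way to do that, assuming it is run from the repository root and that the scripts can be launched with `bash`. The `on_policy_scripts` function and the `DRY_RUN` variable are hypothetical names introduced here, not options provided by the repo.

```bash
#!/usr/bin/env bash
# Sketch: list or launch every on-policy script named in this README.
# Assumes the current directory is the repository root.
set -eu

ENVS="ant halfcheetah walker humanoid"
ALGOS="happo hypermarl mappo_ind_actor mappo_fups"

# Print one script path per line for each environment/algorithm pair.
on_policy_scripts() {
  local env algo
  for env in $ENVS; do
    for algo in $ALGOS; do
      echo "exps/raw_scripts/mujoco/${env}/${algo}.sh"
    done
  done
}

# DRY_RUN is a hypothetical variable of this sketch, not a repo option.
# It defaults to 1 so that nothing is launched by accident.
DRY_RUN="${DRY_RUN:-1}"

on_policy_scripts | while read -r script; do
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: ${script}"
  else
    # Redirect stdin so the launched script cannot consume the path list.
    bash "${script}" </dev/null
  fi
done
```

With the default `DRY_RUN=1` the loop only prints the paths. Setting `DRY_RUN=0` runs the scripts one after another, which may take a long time for the full set.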

## Off-Policy - MATD3 as the base algorithm to match the Kaleidoscope paper

To run the experiments, use the following commands (one script per algorithm and environment):

```bash
# ant
exps/raw_scripts/mujoco/ant/off_policy/matd3.sh
exps/raw_scripts/mujoco/ant/off_policy/hypermarl_matd3.sh
exps/raw_scripts/mujoco/ant/off_policy/kalei_matd3.sh

# halfcheetah
exps/raw_scripts/mujoco/halfcheetah/off_policy/matd3.sh
exps/raw_scripts/mujoco/halfcheetah/off_policy/hypermarl_matd3.sh
exps/raw_scripts/mujoco/halfcheetah/off_policy/kalei_matd3.sh

# walker
exps/raw_scripts/mujoco/walker/off_policy/matd3.sh
exps/raw_scripts/mujoco/walker/off_policy/hypermarl_matd3.sh
exps/raw_scripts/mujoco/walker/off_policy/kalei_matd3.sh

# manyagent_swimmer
exps/raw_scripts/mujoco/manyagent_swimmer/off_policy/matd3.sh
exps/raw_scripts/mujoco/manyagent_swimmer/off_policy/hypermarl_matd3.sh
exps/raw_scripts/mujoco/manyagent_swimmer/off_policy/kalei_matd3.sh
```
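
Because the off-policy paths are relative, launching them from the wrong directory fails with a "file not found" error. The sketch below is a hedged pre-flight check that lists which of the off-policy scripts above cannot be found from the current directory. It assumes the layout `exps/raw_scripts/mujoco/<env>/off_policy/<algo>.sh` shown in this README; `off_policy_scripts` and `missing_scripts` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch: check that the off-policy scripts exist before launching anything.
# Assumes the current directory is meant to be the repository root.
set -eu

ENVS="ant halfcheetah walker manyagent_swimmer"
ALGOS="matd3 hypermarl_matd3 kalei_matd3"

# Print one script path per line for each environment/algorithm pair.
off_policy_scripts() {
  local env algo
  for env in $ENVS; do
    for algo in $ALGOS; do
      echo "exps/raw_scripts/mujoco/${env}/off_policy/${algo}.sh"
    done
  done
}

# Print the paths that are not regular files; prints nothing if all exist.
missing_scripts() {
  local script
  off_policy_scripts | while read -r script; do
    [ -f "$script" ] || echo "$script"
  done
}

missing="$(missing_scripts)"
if [ -n "$missing" ]; then
  echo "missing scripts (is this the repository root?):"
  echo "$missing"
else
  echo "all off-policy scripts found"
fi
```

The check only reports and never launches an experiment, so it is safe to run at any time.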