# AdaDQN - Meta-gradient RL comparison

## Installation guidelines
Python $3.11$ is required for installing the dependencies.
CPU installation:
```bash
python3 -m venv env_cpu
source env_cpu/bin/activate
pip install --upgrade pip setuptools wheel
pip install swig==4.2.1
pip install -e .[dev]
```
GPU installation:
```bash
python3 -m venv env_gpu
source env_gpu/bin/activate
pip install --upgrade pip setuptools wheel
pip install swig==4.2.1
pip install -e .[dev,gpu]
```
To verify the installation, run the tests with ```pytest```.

If any issue arises, the full list of dependencies is given in the ```requirement.txt``` file.

## Code structure
Each folder has a different purpose:
- The _experiments_ folder contains the skeleton of every algorithm (experiments/base/{adadqn, dqn, metadqn}.py). Each subfolder is dedicated to a specific environment and hosts the starting point for each algorithm. The logs and the experiment's metrics will be stored in the experiments/{environment}/{algorithm}/logs and experiments/{environment}/{algorithm}/exp_output folders.
- The _launch\_job_ folder contains the scripts to launch the experiments. Two modes are available: local runs via the files named launch_job/{environment}/local_{algorithm}.sh and remote runs via the files named launch_job/{environment}/cluster_{algorithm}.sh. Local runs are launched in a tmux terminal, while remote runs rely on Slurm.
- The _slimdqn_ folder contains the backbone of the skeleton. The environments are created from the slimdqn/environments folder, each algorithm is implemented in the slimdqn/networks folder, and the code for the replay buffer is stored in slimdqn/sample_collection.
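
The path patterns listed above can be expanded for a given environment and algorithm as in the following sketch. The function `experiment_paths` is a hypothetical helper that is not part of the repository. It only assembles strings from the patterns described above and does not check that the files exist. The pair `lunar_lander` and `adadqn` is used as an assumed example.

```bash
# Hypothetical helper, not part of the repository: it only assembles the
# path patterns described above for one environment/algorithm pair.
experiment_paths() {
    local environment="$1"   # assumed example values: lunar_lander, atari
    local algorithm="$2"     # one of adadqn, dqn, metadqn
    printf '%s\n' "launch_job/${environment}/local_${algorithm}.sh"     # local run (tmux)
    printf '%s\n' "launch_job/${environment}/cluster_${algorithm}.sh"   # remote run (Slurm)
    printf '%s\n' "experiments/${environment}/${algorithm}/logs"        # logs folder
    printf '%s\n' "experiments/${environment}/${algorithm}/exp_output"  # exp_output folder
}

# Print the four expected locations for one assumed pair.
experiment_paths lunar_lander adadqn
```

Each printed line is one of the locations named in the list above, with `{environment}` and `{algorithm}` substituted.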

## Running scripts
Running the script launch_job/lunar_lander/specific.sh locally trains an AdaDQN agent and a Meta-gradient DQN agent for $5 \times 10^{5}$ environment steps and $1$ seed on the Lunar Lander environment.

Running the script launch_job/atari/specific.sh locally trains an AdaDQN agent and a Meta-gradient DQN agent for $40$M frames and $1$ seed on the game BattleZone. This script was used to generate Figure 8.
