# Implementation of Graph Backup: Data-Efficient Backup Exploiting Markovian Transitions 
The implementation of vanilla DQN for MiniGrid and MinAtar is based on https://github.com/Kaixhin/Rainbow, under the directory `gridworld` .
The implementation of Rainbow for Atari is based on https://github.com/mila-iqia/spr, under the directory `atari`.

## Install
```bash
conda create -n gb python=3.9
conda activate gb
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
# setup atari ROMS
cd atari
wget http://www.atarimania.com/roms/Roms.rar
unrar x Roms.rar
python -m atari_py.import_roms .
```

## Usage

To run mingrid experiments:

```bash
cd gridworld
python core/run.py --id=T-1-1 --exp_group=T-1 --env=MiniGrid-KeyCorridorS3R1-v0 --num_steps 1000000 --seed=1 --disable_noisy --disable_dist --priority-exponent=0.0 --disable_duelling --disable_noisy --distill_steps=1 --buffer_sample=uniform --initialization=distilled --multi-step=10 --backup_target=graph-limited --buffer_key=transition --branching_limit=50 --backup_target_update --discount=0.95 --learning-rate=0.001
```

To run minatar experiments:

```bash
cd gridworld
python core/run.py --id=T-2-1 --exp_group=T-2 --env=Minatar-seaquest --num_steps 1000000 --seed=1 --disable_noisy --disable_dist --priority-exponent=0.0 --disable_duelling --disable_noisy --distill_steps=1 --buffer_sample=uniform --initialization=distilled --multi-step=5 --backup_target=graph-limited --buffer_key=transition --branching_limit=20 --backup_target_update --hidden-size=256 --learning-rate=0.000065 --learn-start=1600 --target-update=8000 --replay-frequency=4
```

To run atari experiments:
```bash
cd atari 
python scripts/run.py --game=breakout --exp_id=T-3-1 --seed=1 --num-logs=10 --spr=0 --backup=graph --augmentation none --target-augmentation 0 --momentum-tau 0.01 --n-step=10 --breath=10 --architecture=spr --learning_rate=0.0001 --limit_sample_method=uniform
```
