# PyTorch Implementation for ROTOC (Rollout Total Correlation for Deep Reinforcement Learning)
 
These instructions were tested on Ubuntu 20.04.

## Install dependencies
### Create a conda env
```shell
conda create --name rotoc python=3.9.7
conda activate rotoc
```
### Install MuJoCo
1. Download the MuJoCo version 2.1 binaries for Linux from [DeepMind](https://github.com/google-deepmind/mujoco/releases?page=4).  
2. Extract the downloaded `mujoco210` directory into `~/.mujoco/mujoco210`.  
3. Install mujoco-py.
```shell
pip3 install -U 'mujoco-py<2.2,>=2.1'
```
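
On Linux, mujoco-py usually needs the MuJoCo shared libraries on `LD_LIBRARY_PATH` before it can build and import. The following is a minimal sketch that assumes the default location from step 2; `MUJOCO_DIR` and the helper `check_mujoco_dir` are illustrative names and not part of this repository.

```shell
# Assumed install location from step 2; adjust if you extracted elsewhere.
MUJOCO_DIR="${MUJOCO_DIR:-$HOME/.mujoco/mujoco210}"

# Illustrative helper (not part of this repository): succeeds only if the
# directory contains the bin/ folder that holds the MuJoCo shared libraries.
check_mujoco_dir() {
    if [ -d "$1/bin" ]; then
        echo "found MuJoCo at $1"
    else
        echo "MuJoCo not found at $1" >&2
        return 1
    fi
}

# Make the shared libraries visible to mujoco-py for the current shell session.
if check_mujoco_dir "$MUJOCO_DIR"; then
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MUJOCO_DIR/bin"
fi
```

To make the setting persistent, the `export` line can be added to `~/.bashrc`.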

### Install other packages

```shell
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch  
pip install git+https://github.com/denisyarats/dmc2gym.git
pip install -r requirements.txt
```

## Running experiments on standard MuJoCo tasks
To train and evaluate ROTOC on the Walker Walk task, run the following command. Change `--domain_name` and `--task_name` to test ROTOC on other environments.
```shell
python train_cluster.py --domain_name walker \
                        --task_name walk \
                        --action_repeat 2 \
                        --eval_freq 5000 \
                        --num_train_steps 251000 \
                        --batch_size 128 \
                        --time_step 2 \
                        --rotoc_lr 1e-4 \
                        --intr_mi_coef 0.001 \
                        --omega_rotoc_loss 0.1  \
                        --results_dir ./logs \
                        --seed 0
```
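
Results are often reported over several seeds. The sketch below repeats the command above once per seed; the helper `run_seeds`, the `DRY_RUN` switch, and the per-seed `--results_dir` subdirectories are assumptions made for illustration and are not part of this repository.

```shell
# Illustrative helper (not part of this repository): run the Walker Walk
# configuration once for every seed passed as an argument.
# Set DRY_RUN=1 to print each command instead of executing it.
run_seeds() {
    for seed in "$@"; do
        ${DRY_RUN:+echo} python train_cluster.py --domain_name walker \
            --task_name walk \
            --action_repeat 2 \
            --eval_freq 5000 \
            --num_train_steps 251000 \
            --batch_size 128 \
            --time_step 2 \
            --rotoc_lr 1e-4 \
            --intr_mi_coef 0.001 \
            --omega_rotoc_loss 0.1 \
            --results_dir "./logs/seed_${seed}" \
            --seed "$seed"
    done
}

# Preview the commands for three seeds without launching any training.
DRY_RUN=1 run_seeds 0 1 2
```

Dropping `DRY_RUN=1` launches the runs one after another. Writing each seed to its own subdirectory is a precaution against runs overwriting each other's logs; it may be unnecessary if `train_cluster.py` already separates runs.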


## Running experiments on noisy MuJoCo tasks
Noise is introduced by setting the `add_distractor` flag to `True` and the `img_source` flag to `noise`:
```shell
--add_distractor True --img_source noise
```

To evaluate ROTOC on the Walker Walk task with Gaussian noise, run the following command, which can be modified to test ROTOC on different noisy environments.
```shell
python train_cluster.py --domain_name walker \
                        --task_name walk \
                        --action_repeat 2 \
                        --eval_freq 5000 \
                        --num_train_steps 251000 \
                        --batch_size 128 \
                        --time_step 2 \
                        --rotoc_lr 1e-4 \
                        --intr_mi_coef 0.001 \
                        --omega_rotoc_loss 0.1 \
                        --add_distractor True \
                        --img_source noise \
                        --results_dir ./logs \
                        --seed 0
```

## Monitor

The learning progress can be observed by running the following command:
```shell
tensorboard --logdir ./logs --port 6006
```
Then open `localhost:6006` in a browser.


