# PyTorch Implementation of MTC: Maximum Total Correlation Reinforcement Learning

## Install dependencies
These instructions were tested on Ubuntu 20.04.
```shell
conda create --name mtc python=3.8  
conda activate mtc  
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch  
conda install tensorboard  
pip install mujoco==2.3.7  
pip install dm_control==1.0.9    
pip install git+https://github.com/denisyarats/dmc2gym.git   
pip install gym==0.22.0
pip install numpy==1.23.5
pip install absl_py==1.4.0  
pip install joblib==1.2.0
pip install Pillow==9.4.0
pip install termcolor
pip install experiment-launcher==1.3
```

## Running experiments
To reproduce the performance of MTC on the Hopper Stand task, run `bash script.sh` from the root of this repository. `script.sh` contains the following command, which can be modified to test MTC on other environments.
```shell
python train.py --domain_name hopper \
                --task_name stand \
                --results_dir ./logs \
                --seed 0 \
                --eval_freq 20000 \
                --num_train_steps 1000000 \
                --horizon 15 \
                --kl_constraint -3.0
```
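
To run several seeds, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the `run` helper, the `DRY_RUN` variable, and the per-seed `--results_dir` layout are assumptions made for illustration, while the flags themselves are the ones shown above. By default the sketch only prints each command; set `DRY_RUN=0` to actually launch training.
```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): prints the command
# when DRY_RUN is 1 (the default here), otherwise executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sweep over a few seeds with the same flags as script.sh.
# The per-seed results directory is an assumption, not a repository convention.
for seed in 0 1 2; do
    run python train.py --domain_name hopper \
                        --task_name stand \
                        --results_dir "./logs/seed_${seed}" \
                        --seed "${seed}" \
                        --eval_freq 20000 \
                        --num_train_steps 1000000 \
                        --horizon 15 \
                        --kl_constraint -3.0
done
```
With `DRY_RUN=0`, the runs execute one after another, so a full sweep takes roughly three times as long as a single run.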

To save the policy, set the `save_model` flag to True: `--save_model True`.


To monitor the learning progress, run the following command:
```shell
tensorboard --logdir ./logs --port 6006
```
and then open `localhost:6006` in a browser.

## Baseline results
The baseline results can be obtained by running the official code for each method.

**SAC**: We used the official [code](https://github.com/denisyarats/pytorch_sac) from "Improving Sample Efficiency in Model-Free Reinforcement Learning from Images, AAAI 2021" to obtain the results for SAC.

**LZ-SAC**: We used the official [code](https://github.com/tankred-saanum/simple_priors) presented in "Reinforcement Learning with Simple Sequence Priors, NeurIPS 2023" to obtain the results for LZ-SAC.

**RPC**: We used the official [implementation](https://github.com/google-research/google-research/tree/master/rpc) of "Robust Predictable Control, NeurIPS 2021" to evaluate the original RPC.
