
# Robust_Distributional_RL
Code for the paper **Robust RL with a Distributional Risk-averse Formulation**.

This repository implements the method of the paper, which links robustness and regularization, for both continuous and discrete environments.

The code runs on several discrete and continuous environments: CartPole-v1 and Acrobot-v1 for discrete action spaces,
and Walker2d-v3, Hopper-v3, and HalfCheetah-v3 for the continuous case.

You first need to install MuJoCo, which can be downloaded [here](https://github.com/deepmind/mujoco).

Our implementation is based on the [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) package and on [SB3 contrib](https://stable-baselines3.readthedocs.io/en/master/guide/sb3_contrib.html) for both [TQC](https://arxiv.org/pdf/2005.04269.pdf) and [QR-DQN](https://arxiv.org/abs/1710.10044), alongside the PPO and SAC algorithms from Stable-Baselines3.


# Repository structure: 3 packages

- Stable-baselines3 for classic RL algorithms
- A new version of SB3-contrib for QR-DQN and TQC algorithms with standard deviation penalization.
- Robust_RL with main scripts and results.
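
As a rough illustration of standard deviation penalization, the sketch below scores each action by the mean of its quantile estimates minus `alpha` times their standard deviation, then picks the best-scoring action. It is not the code from this repository: the function names, the use of the population standard deviation, and the choice to apply the penalty at action selection are all assumptions made for the example.

```python
from statistics import fmean, pstdev


def penalized_value(quantiles, alpha):
    """Mean of the quantile estimates minus alpha times their standard deviation."""
    return fmean(quantiles) - alpha * pstdev(quantiles)


def risk_averse_action(quantiles_per_action, alpha):
    """Index of the action with the highest penalized value."""
    scores = [penalized_value(q, alpha) for q in quantiles_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


# Two hypothetical actions: action 0 has the higher mean but a wider spread.
quantiles = [
    [0.0, 5.0, 10.0],  # mean 5.0, large spread
    [3.0, 4.0, 5.0],   # mean 4.0, small spread
]

risk_neutral = risk_averse_action(quantiles, alpha=0.0)  # ranks by mean only
risk_averse = risk_averse_action(quantiles, alpha=1.0)   # penalizes the spread
```

With `alpha=0.0` the higher-mean action 0 is selected; with `alpha=1.0` the spread of action 0 outweighs its mean advantage and action 1 is selected.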


In **Robust_RL**:

- The **results_plot** directory contains the .csv result files; the figures plotted for the paper are in results_plot/all_plot.
- The **multiprocessing_main** directory contains the scripts for running the experiments in both the discrete and continuous cases: multiprocess_continuous.py, multiprocess_cartpole.py, and multiprocess_acrobot.py.

The hyperparameters used in the paper are set as the default values of the argument parser, except for the degree of penalization.
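
The pattern described here, defaults for everything except the degree of penalization, can be sketched with `argparse` as follows. The flag names `--env` and `--penalization` are hypothetical and chosen only for illustration; check the scripts in **multiprocessing_main** for the actual argument names.

```python
import argparse


def build_parser():
    """Build a parser where only the penalization has no default (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Hypothetical experiment launcher")
    # Has a default, so it can be omitted on the command line.
    parser.add_argument("--env", default="CartPole-v1")
    # No default: the degree of penalization must be supplied for each run.
    parser.add_argument("--penalization", type=float, required=True)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--penalization", "0.5"])
print(args.env, args.penalization)
```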

# Further details for installation 

The parameters of the algorithms are predefined in the scripts.


- 1) Install the specific stable-baselines3 version shipped with this repo by running the following from inside the stable-baselines3 directory:

```bash
$ pip install -e .
``` 

- 2) Do the same for sb3-contrib.
- 3) Finally, install **Robust_RL** with the same command.
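
Once the three steps above are done, a quick way to confirm that the two library packages are visible to Python is to look them up by import name. This is a minimal sketch, not part of the repository; `missing_packages` is a helper introduced here, and the import name of **Robust_RL** is not checked because it is not stated in this README.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level packages in `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names of the two libraries; they differ from the pip project names.
required = ["stable_baselines3", "sb3_contrib"]
missing = missing_packages(required)
print("All packages found" if not missing else f"Missing: {missing}")
```

Note that this only checks that the packages can be located; it does not verify that the installed versions are the modified ones from this repo.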





