# Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

This repository contains the code for reproducing the gridworld experiments in our anonymous submission titled "Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning". 

## Usage 
To install all dependencies with Anaconda, run `conda env create -f environment.yml`, then activate the environment with `source activate cap` (or `conda activate cap` on conda 4.4 and later).

To replicate the HalfCheetah experiments, run the command for the method you want to train:

**CAP**
```
python train_cem.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --learn_kappa --seed 1
```

**CAP with fixed kappa**
```
python train_cem.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --kappa 1.0 --seed 1
```

**CCEM**
```
python train_cem.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --seed 1
```

**CEM**
```
python train_cem.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--seed 1
```

## Acknowledgement

This repository contains code from the following repository:

- [PETS](https://github.com/quanvuong/handful-of-trials-pytorch)

We thank the authors and contributors for open-sourcing their code.