# State-wise Constrained Policy Optimization 



## Simulator Installation

Install [mujoco_py](https://github.com/openai/mujoco-py); see the mujoco_py documentation for details. Note that mujoco_py **requires Python 3.6 or greater**.

Then install Safety Gym Arm:

```
cd safety-gym-arm

pip install -e .
```
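
Before moving on, it can help to confirm that the interpreter meets the Python 3.6 requirement and that `mujoco_py` is visible to it. The following is a minimal sketch, not part of this repository; the helper names `python_is_supported` and `package_is_installed` are illustrative assumptions.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated above for mujoco_py


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def package_is_installed(name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Report the two checks; nothing is installed or modified here.
print("Python version OK:", python_is_supported())
print("mujoco_py found:", package_is_installed("mujoco_py"))
```

If the second line prints `False`, the mujoco_py installation step above most likely did not complete in the active environment.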

## Environment Installation

```
conda create --name safebench --file requirements.txt
```


## Policy Training
For example, to train SCPO:
```
cd train/scpo

conda activate safebench

python scpo.py --task goal8_noconti --seed 1
```
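
To train over several seeds, the command above can be generated once per seed. The sketch below only builds and prints the command lines, using the `--task` and `--seed` flags shown above; the helper name `build_scpo_commands` and the seed list are hypothetical, and the commands are assumed to be run from `train/scpo` with the `safebench` environment active.

```python
def build_scpo_commands(task, seeds):
    """Return one argument list per seed for the scpo.py command shown above."""
    return [
        ["python", "scpo.py", "--task", task, "--seed", str(seed)]
        for seed in seeds
    ]


# Dry run: print the commands instead of executing them.
# To execute one, pass it to subprocess.run(cmd, check=True).
for cmd in build_scpo_commands("goal8_noconti", [1, 2, 3]):
    print(" ".join(cmd))
```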

## SCPO Policy Video Production
After training has finished, produce a video of the trained policy:
```
python scpo_video.py --model_path logs/<scpo log>/<scpo log specific seed>/pyt_save/model.pt --task <experiment name> --video_name <video name> --max_epoch <max epoch>
```
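
To find the value for `--model_path`, the saved models can be listed by following the directory pattern in the command above. This is a sketch under the assumption that every run stores its model at `logs/<scpo log>/<scpo log specific seed>/pyt_save/model.pt`; the helper name `find_saved_models` is hypothetical.

```python
from pathlib import Path


def find_saved_models(log_root="logs"):
    """Return sorted paths matching <log_root>/*/*/pyt_save/model.pt."""
    return sorted(Path(log_root).glob("*/*/pyt_save/model.pt"))
```

Calling `find_saved_models()` from the directory that contains `logs/` returns the candidate paths, which can then be passed to `scpo_video.py`.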

## Plot the Training Curve 
```
cd train
mkdir comparison
# copy the logs you want to visualize into the comparison/ folder
python utils/plot.py comparison/ --title test --reward --cost
```
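
The copy step can also be scripted. The sketch below copies each chosen log directory into `comparison/` under its own name and skips any that are already present; the helper name `collect_logs` is hypothetical, and the directories to copy are whatever runs you want to compare.

```python
import shutil
from pathlib import Path


def collect_logs(log_dirs, dest="comparison"):
    """Copy each log directory into dest and return the target paths."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    targets = []
    for src in map(Path, log_dirs):
        target = dest / src.name
        if not target.exists():  # leave previously copied logs untouched
            shutil.copytree(src, target)
        targets.append(target)
    return targets
```

After the logs are collected, run the `utils/plot.py` command above on the `comparison/` folder.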