# Fuz-RL

- This is the official implementation of [Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty]. The paper proposes **Fuz-RL**, a fuzzy-guided robust framework for safe RL that integrates fuzzy measures and Choquet integrals into safe RL to improve robust and safe performance in uncertain environments.

  ![Fuz-RL-frame](./imgs/Fuz-RL-framework.png)

- The training code is based on [Spinning Up](https://github.com/openai/spinningup), and the training environments are based on the [Safe-Control-Gym](https://github.com/utiasDSL/safe-control-gym) suite, which includes physics-based CartPole and Quadrotor [Gym](https://gym.openai.com/) environments (using [PyBullet](https://pybullet.org/wordpress/)) with symbolic a priori dynamics.

![systems](./imgs/systems.png)
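
For readers unfamiliar with the Choquet integral mentioned above, the following is a minimal, generic sketch of the discrete Choquet integral with respect to a fuzzy measure. It is not taken from this repository and does not reproduce how Fuz-RL applies the integral; the function names (`choquet_integral`, `additive_measure`, `squared_cardinality_measure`) and the example measures are assumptions chosen for illustration.

```python
from typing import Callable, FrozenSet, Sequence

# Illustrative only: names and example measures are hypothetical,
# not part of the Fuz-RL code base.


def choquet_integral(values: Sequence[float],
                     mu: Callable[[FrozenSet[int]], float]) -> float:
    """Discrete Choquet integral of `values` with respect to `mu`.

    `mu` maps a set of indices to [0, 1] with mu(empty set) = 0,
    mu(all indices) = 1, and mu(A) <= mu(B) whenever A is a subset of B.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])  # ascending
    total, previous = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Indices whose value is at least values[idx].
        upper_set = frozenset(order[rank:])
        total += (values[idx] - previous) * mu(upper_set)
        previous = values[idx]
    return total


def additive_measure(weights: Sequence[float]) -> Callable[[FrozenSet[int]], float]:
    """Additive measure: the Choquet integral reduces to a weighted sum."""
    return lambda subset: sum(weights[i] for i in subset)


def squared_cardinality_measure(n: int) -> Callable[[FrozenSet[int]], float]:
    """Non-additive measure that depends only on the subset size."""
    return lambda subset: (len(subset) / n) ** 2


values = [1.0, 2.0, 3.0]
additive_result = choquet_integral(values, additive_measure([0.2, 0.3, 0.5]))
pessimistic_result = choquet_integral(values, squared_cardinality_measure(3))
print(additive_result, pessimistic_result)
```

With the additive measure the result equals the ordinary weighted average of the values. With the non-additive measure the result falls below the plain mean, which illustrates how a fuzzy measure can weight the lower outcomes more heavily.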

## Getting Started

- The training code is in the "./src" folder. Environment parameters can be customized for specific tasks by editing the configuration files under "./envs/safe_control_gym/config_overrides".
- We provide three safe/robust RL baselines ([PPO-L](https://cdn.openai.com/safexp-short.pdf), [CPPO](https://www.ijcai.org/proceedings/2022/0510.pdf), [CUP](https://arxiv.org/pdf/2209.07089.pdf)) and the corresponding Fuz-RL methods (Fuz-PPOL, Fuz-CPPO, Fuz-CUP). All of them are built on [Spinning Up](https://github.com/openai/spinningup) (we removed unnecessary files to make the code clearer).
- First, install Spinning Up:

```bash
cd src
pip install -e .
```

- Then install the required packages:

```bash
pip install -r requirements.txt
```

## Train

- Run the training code as follows:

```bash
python train_fuzzy.py --alg ppolag --env CartPole --task stab --epoch 500 --len 150
python train_fuzzy.py --alg fuzppolag --env CartPole --task stab --epoch 500 --len 150 --level 10
```

- For CPPO/FuzCPPO, you can also adjust parameters such as **--cppo_beta** and **--cppo_nu_start**:

```bash
python train_fuzzy.py --alg cppo --env Quadrotor --task track --epoch 1000 --len 250 --cppo_beta 100 --cppo_nu_start 10
python train_fuzzy.py --alg fuzcppo --env Quadrotor --task track --epoch 1000 --len 250 --cppo_beta 100 --cppo_nu_start 10 --level 10
```

- For CUP/FuzCUP, you can also adjust parameters such as **--cup_lambda** and **--cup_nu**:

```bash
python train_fuzzy.py --alg cup --env Quadrotor --task stab --epoch 1000 --len 250 --cup_lambda 0.95 --cup_nu 0.2
python train_fuzzy.py --alg fuzcup --env Quadrotor --task stab --epoch 1000 --len 250 --cup_lambda 0.95 --cup_nu 0.2 --level 5
```
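
When several baseline and Fuz-RL runs need to be launched with the same settings, the commands above can be generated from a small script. The following is a rough sketch, not part of this repository: the helpers `build_train_cmd` and `run_all` are hypothetical, and the script only uses flags that appear in the commands above. It assumes it is run from the directory that contains `train_fuzzy.py`, and it follows the examples above in passing `--level` only to the Fuz variants.

```python
import subprocess
import sys
from typing import Dict, List, Optional

# Hypothetical helpers for illustration; only the command-line flags
# shown in this README are used.


def build_train_cmd(alg: str, env: str, task: str, epoch: int, length: int,
                    level: Optional[int] = None,
                    extra: Optional[Dict[str, float]] = None) -> List[str]:
    """Build the argument list for one train_fuzzy.py run."""
    cmd = [sys.executable, "train_fuzzy.py", "--alg", alg, "--env", env,
           "--task", task, "--epoch", str(epoch), "--len", str(length)]
    for flag, value in (extra or {}).items():
        cmd += [f"--{flag}", str(value)]
    if level is not None:
        cmd += ["--level", str(level)]
    return cmd


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


cup_args = {"cup_lambda": 0.95, "cup_nu": 0.2}
commands = [
    build_train_cmd("cup", "Quadrotor", "stab", 1000, 250, extra=cup_args),
    build_train_cmd("fuzcup", "Quadrotor", "stab", 1000, 250, level=5,
                    extra=cup_args),
]
run_all(commands)  # dry run: prints the commands without starting training
```

Calling `run_all(commands, dry_run=False)` would start the runs one after another and stop at the first failure because of `check=True`.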

## Test

Before testing, add the path of each trained model to "**fpath_list**" and make sure it matches the algorithm in "**alg_list**", the environment in "**agent_list**", and the task in "**task_list**".

- Evaluate performance under observation disturbance (white noise):

```bash
python test_fuzzy.py --alg_list PPOL FuzPPOL --env_list CartPole --task_list Stab --disturb_part observation --disturb_type white_noise
```

- Evaluate performance under dynamics disturbance (periodic noise):

```bash
python test_fuzzy.py --alg_list PPOL FuzPPOL --env_list CartPole --task_list Stab --disturb_part dynamics --disturb_type periodic
```

- Evaluate performance under action disturbance (impulse noise):

```bash
python test_fuzzy.py --alg_list PPOL FuzPPOL --env_list CartPole --task_list Stab --disturb_part action --disturb_type impulse
```

By default, testing runs 10 episodes with seeds 0 to 9. You can change this with the **--test_seed** and **--num_eps** parameters.
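
To give an intuition for the three disturbance types used in the test commands (white noise, periodic, and impulse), the following is a simplified sketch of what such signals typically look like. It is an assumption for illustration only and does not reproduce the Safe-Control-Gym implementation; the function names and the parameters `std`, `amplitude`, `period`, `magnitude`, `start`, and `duration` are hypothetical.

```python
import math
import random

# Simplified, hypothetical disturbance models for illustration;
# not the Safe-Control-Gym implementation.


def white_noise(step: int, std: float, rng: random.Random) -> float:
    """Zero-mean Gaussian noise, drawn independently at every step."""
    return rng.gauss(0.0, std)


def periodic(step: int, amplitude: float, period: int) -> float:
    """Sinusoidal disturbance that repeats every `period` steps."""
    return amplitude * math.sin(2.0 * math.pi * step / period)


def impulse(step: int, magnitude: float, start: int, duration: int) -> float:
    """Constant offset applied only during steps [start, start + duration)."""
    return magnitude if start <= step < start + duration else 0.0


rng = random.Random(0)  # fixed seed so repeated runs see the same noise
for step in range(0, 10, 3):
    print(step,
          round(white_noise(step, 0.1, rng), 4),
          round(periodic(step, 0.5, 20), 4),
          impulse(step, 1.0, 3, 2))
```

Seeding the random generator, as the test script does through its seed setting, keeps the white-noise sequence reproducible across runs.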