# README

This code is adapted from [https://github.com/chenhongge/SA_DQN](https://github.com/chenhongge/SA_DQN).

We evaluate models trained with empirically robust RL methods. The trained models can be downloaded from the following anonymous Google Drive link: [https://drive.google.com/file/d/1cnNwzRuS04chMCxenpcz0I6TuQ2YaDgr/view?usp=sharing](https://drive.google.com/file/d/1cnNwzRuS04chMCxenpcz0I6TuQ2YaDgr/view?usp=sharing). Uncompress the archive into the `models/` directory. The configuration files are located in the `config/` directory.

This repository mainly consists of three certification algorithms: **CROP-LoAct**, **CROP-GRe**, and **CROP-LoRe**. The sections below give example commands for running each certification algorithm, as well as empirical attacks against two policies.
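The example commands in this README pass arguments of the form `section:key=value` after `--config`. The following is a minimal sketch of how such colon-delimited overrides could be merged into a nested JSON configuration. It is an illustration only: the functions `parse_value` and `apply_override` are hypothetical names, and the parser actually used by the scripts may behave differently.

```python
import json
from typing import Any


def parse_value(raw: str) -> Any:
    """Interpret the value as JSON when possible, otherwise keep it as a string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def apply_override(config: dict, override: str) -> None:
    """Apply one 'a:b:c=value' override to a nested dict in place."""
    path, _, raw = override.partition("=")
    keys = path.split(":")
    node = config
    # Walk (and create, if missing) every intermediate level of the path.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = parse_value(raw)


# Hypothetical starting configuration; the real one comes from config/*.json.
config = {"test_config": {"sigma": 0.1, "smooth": False}}
for item in [
    "test_config:sigma=1.00",
    "test_config:smooth=true",
    "test_config:attack_config:params:epsilon=0.25",
    "test_config:load_model_path=models/Freeway-convex.model",
]:
    apply_override(config, item)

print(json.dumps(config, indent=2))
```

Under this reading, each override replaces a single leaf of the configuration loaded from `--config`, so the JSON file supplies the defaults and the command line supplies the per-run changes.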

## Certification algorithms

### CROP-LoAct

We first run the pre-processing step to obtain the output range of the Q-network, e.g.,

```bash
python estimate_q_range.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:m=1000 test_config:sigma=1.00 test_config:smooth=true test_config:num_episodes=10
```

Then, we update the configuration file `config_v_table.py` and run the per-state certification algorithm, e.g.,

```bash
python certify_r.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:m=10000 test_config:sigma=1.00 test_config:smooth=true test_config:max_frames_per_episode=500 test_config:num_episodes=10
```

The results are stored in files with the suffix `_R-list-{i}.pt`.
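To inspect the saved results, a sketch along the following lines may help. It assumes that `{i}` in the suffix is an index, that the files are written to the current directory, and that they were saved with `torch.save`. None of this is stated above, and the structure of the stored object is not documented here, so the sketch only reports its type and length. The helper names are hypothetical.

```python
import glob
import os


def find_result_files(directory: str, suffix_pattern: str) -> list:
    """Return sorted paths in `directory` whose names end with the glob pattern."""
    return sorted(glob.glob(os.path.join(directory, "*" + suffix_pattern)))


def summarize(path: str) -> str:
    """Load one result file and describe the stored object."""
    # Imported lazily so that file discovery works without PyTorch installed.
    import torch

    # Assumption: the file was written with torch.save. Recent PyTorch versions
    # may additionally need weights_only=False for non-tensor Python objects.
    obj = torch.load(path, map_location="cpu")
    try:
        size = len(obj)
    except TypeError:
        size = None
    return f"{os.path.basename(path)}: {type(obj).__name__}, len={size}"


if __name__ == "__main__":
    # "." is an assumed output directory; adjust it to where the script writes.
    for path in find_result_files(".", "_R-list-*.pt"):
        print(summarize(path))
```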

### CROP-GRe

Example command to run the global smoothing algorithm:

```bash
python test_GS.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:num_episodes=10000 test_config:max_frames_per_episode=500 test_config:m=1 test_config:smooth=true test_config:eps=0.5 test_config:GS=true
```

### CROP-LoRe

Example command to run the adaptive search algorithm:

```bash
python test_tree.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:m=10000 test_config:sigma=1.00 test_config:smooth=true training_config:use_async_env=false test_config:max_frames_per_episode=200
```

The results are stored in a file with the suffix `_certify-map.pt`.

## Empirical attacks

### Attacks against the locally smoothed policy $\tilde\pi$

Example command:

```bash
python test_attack.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:m=10000 test_config:sigma=1.00 test_config:smooth=true test_config:max_frames_per_episode=200 test_config:attack_config:params:epsilon=0.25 test_config:num_episodes=1 training_config:use_async_env=false
```

### Attacks against the $\sigma$-randomized policy $\pi'$

Example command:

```bash
python test_attack_global.py --config config/Freeway_cov.json test_config:load_model_path=models/Freeway-convex.model test_config:m=1 test_config:sigma=0.75 test_config:smooth=true test_config:max_frames_per_episode=500 test_config:attack_config:params:epsilon=0.08 test_config:num_episodes=101 training_config:use_async_env=true
```
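To repeat any of the commands above over several parameter values, the argument list can be assembled programmatically. The sketch below rebuilds the `test_attack.py` command for a few values of `epsilon`. The helper names `format_value` and `build_command` and the swept values are hypothetical and chosen only for illustration; the script name, configuration path, and override keys are taken from the examples in this README.

```python
import shlex


def format_value(value) -> str:
    """Render booleans as lowercase 'true'/'false', as in the example commands."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_command(script: str, config_path: str, overrides: dict) -> list:
    """Build the argument list for one run from a dict of 'a:b:c' -> value."""
    cmd = ["python", script, "--config", config_path]
    cmd += [f"{key}={format_value(val)}" for key, val in overrides.items()]
    return cmd


base = {
    "test_config:load_model_path": "models/Freeway-convex.model",
    "test_config:m": 10000,
    "test_config:sigma": 1.0,
    "test_config:smooth": True,
    "test_config:max_frames_per_episode": 200,
    "test_config:num_episodes": 1,
    "training_config:use_async_env": False,
}

commands = []
for epsilon in (0.1, 0.25):  # hypothetical sweep values
    overrides = {**base, "test_config:attack_config:params:epsilon": epsilon}
    cmd = build_command("test_attack.py", "config/Freeway_cov.json", overrides)
    commands.append(cmd)
    print(shlex.join(cmd))
```

Each entry of `commands` is a list that can be passed directly to `subprocess.run(cmd, check=True)` to launch the run.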

