# Accelerated Policy Gradient (APG)
**Accelerated Policy Gradient: On the Nesterov Momentum for Reinforcement Learning**


<br/><br/>
## Folder Structure
```
.
├── helper/
│   ├── plot.py
│   └── utils.py
├── mdp_env/
│   ├── bandit_hard.yaml
│   └── bandit_uniform.yaml
├── scripts/
│   ├── run_bandit_hard.sh
│   └── run_bandit_uniform.sh
├── train/
│   ├── APG.py
│   ├── Bellman.py
│   ├── parameters.py
│   ├── PG.py
│   ├── PI.py
│   └── Saver.py
├── .gitignore
├── graph.py
├── LICENSE
├── main.py
├── Readme.md
└── requirements.txt
```
Note: To test another MDP or bandit setting, add its `.yaml` file to the `./mdp_env` directory.

<br/><br/>
## Environment
- Python 3.8.5
    ```sh
    pip3 install -r requirements.txt
    ```
    or
    ```sh
    pip3 install pyyaml termcolor pandas numpy matplotlib tqdm fastparquet
    ```

<br/><br/>
## Quick Start
- Run the following command to perform APG and PG on a [test MDP environment](./mdp_env/test.yaml):
    ```sh
    python3 main.py --fname test
    ```
    Note: The other available arguments are defined [here](./train/parameters.py).


- Run `graph.py` to generate additional plots:
    ```sh
    python3 graph.py --log_dir ./logs/test \
                     --algo APG \
                     --graphing_size 500 1000 \
                     --plot_Summary \
                     --plot_Value \
                     --plot_LogLog \
                     --plot_MomGrad \
                     --plot_OneStep
    ```
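
For intuition about what the APG and PG runs compare, here is a minimal, self-contained sketch of softmax policy gradient with and without Nesterov momentum on a 3-armed bandit. It is an illustration only: the reward vector, step size, function names, and the textbook momentum schedule `(t - 1) / (t + 2)` are assumptions made for this example, and the implementation in `train/APG.py` may use a different schedule or additional safeguards.

```python
import math


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def value(theta, rewards):
    """Expected reward of the softmax policy on a bandit."""
    return sum(p * r for p, r in zip(softmax(theta), rewards))


def grad(theta, rewards):
    """Exact softmax policy gradient: dV/dtheta_a = pi_a * (r_a - V)."""
    pi = softmax(theta)
    v = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - v) for p, r in zip(pi, rewards)]


def run_pg(rewards, eta, steps):
    """Plain gradient ascent on the policy logits."""
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        g = grad(theta, rewards)
        theta = [t + eta * gi for t, gi in zip(theta, g)]
    return value(theta, rewards)


def run_nesterov_pg(rewards, eta, steps):
    """Gradient ascent with Nesterov momentum (assumed textbook schedule)."""
    theta = [0.0] * len(rewards)
    omega = list(theta)  # look-ahead point where the gradient is evaluated
    for t in range(1, steps + 1):
        g = grad(omega, rewards)
        new_theta = [w + eta * gi for w, gi in zip(omega, g)]
        beta = (t - 1) / (t + 2)  # hypothetical momentum coefficient
        omega = [nt + beta * (nt - ot) for nt, ot in zip(new_theta, theta)]
        theta = new_theta
    return value(theta, rewards)


rewards = [1.0, 0.8, 0.2]  # hypothetical bandit, not one of the repo's configs
v_pg = run_pg(rewards, eta=0.4, steps=200)
v_apg = run_nesterov_pg(rewards, eta=0.4, steps=200)
print(f"PG value:  {v_pg:.4f}")
print(f"APG value: {v_apg:.4f}")
```

Both runs start from the uniform policy, so the printed values can be compared directly against the optimal value of `1.0`.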


<br/><br/>
## Random MDP
- Run the following command to perform APG and PG on a random MDP:

    ```sh
    python3 main.py --random_mdp \
                    --state_action_num 5 5 \
                    --fname test_random_mdp_5s5a
    ```
    Note: The details of the generated random MDP are recorded [here](./logs/test_random_mdp_5s5a/args.yaml).
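
As a rough picture of what a random tabular MDP with 5 states and 5 actions can look like, the sketch below samples normalized transition probabilities and rewards in `[0, 1)`. The function name `make_random_mdp`, the sampling distributions, and the reading of `--state_action_num 5 5` as "5 states, 5 actions" are assumptions made for illustration; the generator used by `main.py` may differ.

```python
import random


def make_random_mdp(num_states, num_actions, seed=0):
    """Sample a tabular MDP (hypothetical helper, not part of the repo).

    Returns:
        P: P[s][a] is a probability distribution over next states.
        R: R[s][a] is a reward in [0, 1).
    """
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(num_states):
        p_row, r_row = [], []
        for _ in range(num_actions):
            # Draw positive weights, then normalize them into a distribution.
            weights = [rng.random() + 1e-12 for _ in range(num_states)]
            total = sum(weights)
            p_row.append([w / total for w in weights])
            r_row.append(rng.random())
        P.append(p_row)
        R.append(r_row)
    return P, R


P, R = make_random_mdp(num_states=5, num_actions=5)
print(len(P), len(P[0]), len(P[0][0]))  # states, actions, next states
```

Fixing the seed keeps the sampled MDP reproducible across runs, which mirrors the purpose of recording the generated MDP in the log directory.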


<br/><br/>
## Reproducing Results
To reproduce the numerical results presented in the paper:
- Make each script executable before running it:
    ```sh
    chmod +x ./scripts/{file name}.sh
    ```

- Run:
    ```sh
    ./scripts/run_bandit_hard.sh
    ./scripts/run_bandit_non_monotone.sh
    ./scripts/run_bandit_uniform.sh
    ./scripts/run_mdp_5s5a_hard.sh
    ./scripts/run_mdp_5s5a_uniform.sh
    ```