# Instructions

## Set up the environment.
Create an `anaconda3` environment with Python 3.11.5 (replace `name` with an environment name of your choice).
```bash
conda create --name name python=3.11.5
conda activate name
```

Install the packages.
```bash
pip3 install -r requirements.txt
```

## Run experiments.
Experiments are launched with `run.py`, which takes the following parameters:
- "--dir": the directory in which the results will be saved;
- "--ite": the number of iterations the algorithm performs;
- "--alg": the algorithm to run, either "pg" or "off_pg";
- "--window_length": the window length for off-policy gradient (default: 5);
- "--var": the amount of exploration, given as the variance $\sigma^2$;
- "--pol": the policy to use, either "linear" or "nn";
- "--env": the environment to learn on, one of "swimmer", "half_cheetah", "reacher", "humanoid", "ant", "hopper", "lqr", "pendulum", "cartpole";
- "--horizon": set the horizon of the problem;
- "--gamma": set the discount factor of the problem;
- "--lr": set the step size;
- "--lr_strategy": set the learning rate schedule, you can select "constant" or "adam";
- "--n_workers": specifies how many trajectories are evaluated in parallel;
- "--batch": specifies how many trajectories are evaluated in each iteration;
- "--clip": specifies whether to apply action clipping, either "0" or "1";
- "--n_trials": specifies how many runs of the same experiment are performed;
- "--lqr_state_dim": state dimension for the LQR environment (default: 1);
- "--lqr_action_dim": action dimension for the LQR environment (default: 2);
- "--test": whether to run in test mode, either "0" or "1";
- "--weight_type": the type of weight to use in the off-policy gradient, one of "BH", "MIW", or "RTPG";
- "--from_seed": initial seed for trials (default: 0).
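
Since "--var" expects the variance $\sigma^2$ rather than the standard deviation $\sigma$, it can help to convert between the two before launching a run. The helpers below are a minimal sketch: `std_to_var` and `var_to_std` are hypothetical names and are not part of this repository.

```bash
# Hypothetical helpers (not part of run.py): convert between the standard
# deviation sigma and the variance sigma^2 expected by --var.
std_to_var() {
  # $1: standard deviation sigma; prints sigma^2
  awk -v s="$1" 'BEGIN { printf "%g\n", s * s }'
}

var_to_std() {
  # $1: variance sigma^2; prints sigma rounded to 4 decimals
  awk -v v="$1" 'BEGIN { printf "%.4f\n", sqrt(v) }'
}

std_to_var 0.5   # prints 0.25
var_to_std 0.3   # prints 0.5477
```

For instance, to explore with $\sigma = 0.5$, pass `--var 0.25`.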

Here is an example running RTPG on Cart Pole:
```bash
python3 run.py --dir /your/path --alg off_pg --ite 100 --var 0.3 --pol linear --env cartpole --horizon 200 --gamma 1 --lr 0.01 --lr_strategy adam --n_workers 6 --clip 1 --batch 30 --n_trials 1 --weight_type RTPG --window_length 8
```
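
To compare the three documented weight types under otherwise identical settings, the command above can be wrapped in a small loop. This is only a sketch: `BASE_DIR`, `DRY_RUN`, and `run_one` are hypothetical names, and writing each run to its own subdirectory is an assumption about how you may want to organize results, not something `run.py` requires.

```bash
# Sketch: run the Cart Pole example once per weight type.
# BASE_DIR, DRY_RUN and run_one are hypothetical, not part of run.py.
BASE_DIR="${BASE_DIR:-/your/path}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run_one() {
  # $1: weight type ("BH", "MIW", or "RTPG")
  weight_type="$1"
  # Build the argument list; one results subdirectory per weight type (assumption).
  set -- python3 run.py --dir "${BASE_DIR}/${weight_type}" --alg off_pg \
    --ite 100 --var 0.3 --pol linear --env cartpole --horizon 200 --gamma 1 \
    --lr 0.01 --lr_strategy adam --n_workers 6 --clip 1 --batch 30 \
    --n_trials 1 --weight_type "${weight_type}" --window_length 8
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    mkdir -p "${BASE_DIR}/${weight_type}"
    "$@"
  fi
}

for w in BH MIW RTPG; do
  run_one "$w"
done
```

With the default `DRY_RUN=1` the loop only prints the three commands, so they can be checked before setting `DRY_RUN=0` to execute them.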


