# Instructions to use the code

## Set up the environment.
You need an `anaconda3` environment with Python 3.11.5 (replace `name` with the environment name of your choice).
```bash
conda create --name name python=3.11.5
conda activate name
```

Install the required packages.
```bash
pip3 install -r requirements.txt
```

## Run experiments.
All you need is in `run_cost.py`, which requires several parameters:
- "--dir": specifies the directory in which the results will be saved;
- "--ite": how many iterations the algorithm must run;
- "--alg": the algorithm to run, you can select "cpg", "cpgpe", "npgpd", "rpgpd", "npgpd2", "rpgpd2";
- "--risk": specifies the risk measure, you can select "tc", "cvar", "mv", "chance";
- "--reg": the regularization amount;
- "--var": the exploration amount, i.e. the variance $\sigma^2$;
- "--pol": the policy to use, you can select "linear" or "softmax";
- "--env": the environment on which learning is performed, you can select "swimmer", "half_cheetah", "hopper", "lqr", "gw_d";
- "--horizon": set the horizon of the problem;
- "--gamma": set the discount factor of the problem;
- "--lr": set the step size for all the variables;
- "--lr_strategy": set the learning rate schedule, you can select "constant" or "adam";
- "--n_workers": specifies how many trajectories are evaluated in parallel;
- "--batch": specifies how many trajectories are evaluated in each iteration;
- "--clip": specifies whether to apply action clipping, you can select "0" or "1";
- "--n_trials": specifies how many runs of the same experiment have to be done;
- "--risk_param": the parameter for the risk measure at hand;
- "--c_bounds": the thresholds for the costs;
- "--l_init": initial values for the dual variable;
- "--eta_init": initial values for the additional primal variable;
- "--env_param": parameters for the environment when specified;
- "--alternate": a flag indicating whether to use alternating ascent/descent.
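
To make the roles of "--horizon" and "--gamma" concrete, here is a minimal sketch of the discounted cumulative cost of a single trajectory. It assumes the usual definition $\sum_{t=0}^{T-1} \gamma^t c_t$; `discounted_cost` is a hypothetical helper written for illustration and is not taken from `run_cost.py`, which may compute this quantity differently.

```python
def discounted_cost(costs, gamma, horizon):
    """Sum gamma**t * c_t over the first `horizon` steps of one trajectory."""
    total = 0.0
    for t, cost in enumerate(costs[:horizon]):
        total += (gamma ** t) * cost
    return total


# Illustrative per-step costs (made up, not produced by any environment).
costs = [1.0, 1.0, 1.0, 1.0]
print(discounted_cost(costs, gamma=1.0, horizon=3))  # 3.0
print(discounted_cost(costs, gamma=0.5, horizon=3))  # 1.75
```

With `gamma` equal to 1, as in the example command, every step within the horizon contributes equally.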

Here is an example running CPGPE on bi-dimensional CostLQR with total cost constraints:
```bash
python3 run_cost.py --dir /your/path --alg cpgpe --risk tc --risk_param 0 --c_bounds 1 --ite 100 --var 1 --pol linear --env lqr --env_param 2 --horizon 100 --gamma 1 --lr 0.01 0.1 0.01 --lr_strategy adam --n_workers 6 --clip 1 --batch 30 --n_trials 1
```
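
If you want to launch several runs from a script, the command above can be assembled from a dictionary of parameters. The following is a minimal sketch: `build_command` and the `example` dictionary are hypothetical helpers written for this README, and they only rely on the flags listed above.

```python
import shlex


def build_command(params, script="run_cost.py"):
    """Turn a {flag: value} mapping into an argument list for the script."""
    argv = ["python3", script]
    for flag, value in params.items():
        argv.append(f"--{flag}")
        # A list or tuple becomes several space-separated values (e.g. --lr).
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    return argv


# Same settings as the CPGPE example command.
example = {
    "dir": "/your/path",
    "alg": "cpgpe",
    "risk": "tc",
    "risk_param": 0,
    "c_bounds": 1,
    "ite": 100,
    "var": 1,
    "pol": "linear",
    "env": "lqr",
    "env_param": 2,
    "horizon": 100,
    "gamma": 1,
    "lr": [0.01, 0.1, 0.01],
    "lr_strategy": "adam",
    "n_workers": 6,
    "clip": 1,
    "batch": 30,
    "n_trials": 1,
}

command = build_command(example)
print(shlex.join(command))
# To actually start the run: subprocess.run(command, check=True)
```

To change one setting, override the corresponding entry, for instance `build_command({**example, "n_trials": 5})`.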


