# MORL-FB

## Discrete Environment


### Code Structure

```bash
discrete/
    ├── agent.py --- the training agent of discrete MORL-FB
    ├── base.py --- the structure of trajectory replay buffer
    ├── main.py --- main execution file for MORL-FB algorithms
    ├── module.py --- the model of MORL-FB
    ├── multi_step.py --- the structure of trajectory replay buffer
    ├── prefs/ --- testing preferences generated from a Dirichlet distribution
    ├── requirements.txt --- the required packages for MORL-FB
    ├── test.py --- main execution file for testing model with MORL-FB algorithms
    └── utils.py --- utility functions
```
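
The `prefs/` directory holds preference vectors drawn from a Dirichlet distribution. The snippet below is a minimal sketch of how such vectors could be generated with NumPy; the function name `sample_preferences`, the symmetric concentration of 1, and the seed are illustrative assumptions, not the settings used to produce the shipped files.

```python
import numpy as np


def sample_preferences(num_prefs: int, num_objectives: int, seed: int = 0) -> np.ndarray:
    """Draw preference vectors from a symmetric Dirichlet distribution (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumption: alpha = 1 for every objective, i.e. uniform over the simplex.
    alpha = np.ones(num_objectives)
    return rng.dirichlet(alpha, size=num_prefs)


prefs = sample_preferences(num_prefs=5, num_objectives=2)
print(prefs.shape)        # (5, 2)
print(prefs.sum(axis=1))  # every row sums to 1 up to floating-point error
```

Each row is non-negative and sums to 1, so it can be used directly as a weight vector over the objectives.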

First, change to the `discrete` directory:

```bash
cd discrete
```

### Requirements
* Python version : tested with Python 3.9.16
* Operating system : Ubuntu 20.04
* PyTorch version : 2.0.1

Install other required packages:

```bash
pip install -r requirements.txt
```

### Usage
* How to run?

```bash
python main.py --env_name deep-sea-treasure-v0 --seed 10 --cuda_device 0 --project_name "MORL-FB"
```

* Config:
  * env_name : environment name for training
  * seed : random seed
  * cuda_device : cuda device id
  * project_name : wandb project name

* How to test?

  * Models are saved as `/log/{environment_name}/{time_str}_MORL-FB_{environment_name}/model_{steps}.pth`

```bash
python test.py --env_name deep-sea-treasure-v0 --model_name {time_str}_MORL-FB_{environment_name} --steps 3000000 --output_path rewards/MORL-FB/output.npy --cuda_device 0
```

* Config:
  * env_name : environment name for testing
  * model_name : model directory name
  * steps : training step count that identifies the checkpoint file (`model_{steps}.pth`)
  * output_path : path for saving testing results
  * cuda_device : cuda device id
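
To make the relation between `--env_name`, `--model_name`, `--steps`, and the saved checkpoint explicit, here is a small sketch that assembles the path from the save format given above. The helper `checkpoint_path` and the relative `log` root are assumptions made for illustration; `test.py` may resolve the path differently. The `{time_str}` placeholder is left unfilled on purpose.

```python
from pathlib import Path


def checkpoint_path(env_name: str, model_name: str, steps: int, log_root: str = "log") -> Path:
    """Build the checkpoint path from the documented save format (hypothetical helper)."""
    # Assumption: the log directory is relative to the current working directory.
    return Path(log_root) / env_name / model_name / f"model_{steps}.pth"


path = checkpoint_path(
    env_name="deep-sea-treasure-v0",
    model_name="{time_str}_MORL-FB_deep-sea-treasure-v0",  # replace {time_str} with your run's timestamp
    steps=3000000,
)
print(path.as_posix())
# log/deep-sea-treasure-v0/{time_str}_MORL-FB_deep-sea-treasure-v0/model_3000000.pth
```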


## Continuous Environment

### Code Structure

```bash
continuous/
    ├── base.py --- the structure of trajectory replay buffer
    ├── environments/ --- the environment file for hopper2d, hopper4d and humanoid5d
    ├── main.py --- main execution file for MORL-FB algorithms
    ├── mo_agent.py --- the training agent of continuous MORL-FB
    ├── module.py --- the model of MORL-FB
    ├── multi_step.py --- the structure of trajectory replay buffer
    ├── prefs/ --- testing preferences generated from a Dirichlet distribution
    ├── requirements.txt --- the required packages for MORL-FB
    ├── test.py --- main execution file for testing model with MORL-FB algorithms
    └── utils.py --- utility functions
```


First, change to the `continuous` directory:

```bash
cd continuous
```

### Requirements
* Python version : tested with Python 3.9.16
* Operating system : Ubuntu 20.04
* PyTorch version : 2.0.1

Install other required packages:

```bash
pip install -r requirements.txt
```

### Usage
* How to run?

```bash
python main.py --env_name mo-halfcheetah-v4 --seed 10 --cuda_device 0 --project_name "MORL-FB"
```

* Config:
  * env_name : environment name for training
  * seed : random seed
  * cuda_device : cuda device id
  * project_name : wandb project name

* Tested environment names
  * Halfcheetah2d : mo-halfcheetah-v4
  * Walker2d : mo-walker2d-v4
  * Hopper2d : mo-hopper2d-v0
  * Hopper3d : mo-hopper-v4
  * Hopper4d : mo-hopper4d-v0
  * Ant3d : mo-ant-v4
  * Humanoid2d : mo-humanoid-v4
  * Humanoid5d : mo-humanoid5d-v0

* How to test?

  * Models are saved as `/log/{environment_name}/{time_str}_MORL-FB_{environment_name}/model_{steps}.pth`

```bash
python test.py --env_name mo-halfcheetah-v4 --model_name {time_str}_MORL-FB_{environment_name} --steps 3000000 --output_path rewards/MORL-FB/output.npy --cuda_device 0
```

* Config:
  * env_name : environment name for testing
  * model_name : model directory name
  * steps : training step count that identifies the checkpoint file (`model_{steps}.pth`)
  * output_path : path for saving testing results
  * cuda_device : cuda device id

## Evaluation Metrics Calculation

### Requirements
* Python version : tested with Python 3.9.16
* Operating system : Ubuntu 20.04
* PyTorch version : 2.0.1

Install other required packages:

```bash
pip install -r requirements.txt
```
### Usage
* How to run hv.py?

```bash
python hv.py --pref pref/mo-halfcheetah.npy --ref 0 -8000 --data rewards/FB/mo-halfcheetah.npy
```

* Config:
  * pref : preference set used to calculate hypervolume (HV) and utility (UT).
  * ref : reference point used to calculate hypervolume (HV).
  * data : testing rewards used to calculate hypervolume (HV) and utility (UT).
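
For readers unfamiliar with the two metrics, the following is a minimal sketch of how hypervolume and utility can be computed for a two-objective maximization problem. It is not the code in `hv.py`: the function names, the assumption that row `i` of the rewards corresponds to row `i` of the preferences, and the toy numbers are all illustrative.

```python
import numpy as np


def hypervolume_2d(points: np.ndarray, ref: np.ndarray) -> float:
    """Area dominated by `points` and bounded by `ref` (maximization, 2 objectives)."""
    points = np.asarray(points, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Only points strictly better than the reference point in both objectives contribute.
    points = points[np.all(points > ref, axis=1)]
    # Sweep from the largest first objective to the smallest.
    points = points[np.argsort(-points[:, 0])]
    area, best_second = 0.0, ref[1]
    for first, second in points:
        if second > best_second:
            # Add the horizontal strip that this point newly covers.
            area += (first - ref[0]) * (second - best_second)
            best_second = second
    return float(area)


def mean_utility(prefs: np.ndarray, rewards: np.ndarray) -> float:
    """Average preference-weighted return (assumes rewards[i] was obtained under prefs[i])."""
    prefs = np.asarray(prefs, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    return float(np.mean(np.sum(prefs * rewards, axis=1)))


# Toy data, made up for illustration only.
prefs = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
rewards = np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]])

print(hypervolume_2d(rewards, ref=np.array([0.0, 0.0])))  # 6.0
print(mean_utility(prefs, rewards))                       # about 2.667
```

The reference point should be no better than any achievable return in each objective, which is why the example command passes a negative value for the second objective.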

* How to run ed.py?

```bash
python ed.py --pref pref/mo-halfcheetah.npy --base rewards/FB/mo-halfcheetah.npy --others rewards/Q-Pensieve/mo-halfcheetah.npy
```
* Config:
  * pref : preference set used to calculate episodic dominance (ED).
  * base : rewards of the baseline method used to calculate episodic dominance (ED).
  * others : rewards of the method compared against the base when calculating episodic dominance (ED).
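
As a rough illustration of what a pairwise dominance score looks like, the sketch below assumes that ED is the fraction of preferences for which the base method's preference-weighted return exceeds the other method's. This definition, the function name `episodic_dominance`, and the toy arrays are assumptions; check `ed.py` for the exact computation.

```python
import numpy as np


def episodic_dominance(prefs: np.ndarray, base: np.ndarray, other: np.ndarray) -> float:
    """Fraction of preferences where `base` beats `other` in weighted return (assumed definition)."""
    prefs = np.asarray(prefs, dtype=float)
    # Assumption: row i of each reward array was collected under prefs[i].
    base_utility = np.sum(prefs * np.asarray(base, dtype=float), axis=1)
    other_utility = np.sum(prefs * np.asarray(other, dtype=float), axis=1)
    return float(np.mean(base_utility > other_utility))


# Toy data, made up for illustration only.
prefs = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
base = np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]])
other = np.array([[2.0, 1.0], [3.0, 3.0], [1.0, 2.0]])

print(episodic_dominance(prefs, base, other))  # about 0.667
```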
