# FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

## Installation

To set up the environment, install the package in editable mode from the directory that contains the project's setup files:

```bash
pip install -e .
```

## Training

The training commands below are run from the `scripts` directory, so navigate there first:

```bash
cd scripts
```

### Stage 1: FuseSFT

To train FuseSFT, run:

```bash
sh train_fusesft.sh
```

### Stage 2: FusePO

To train FusePO, choose one of the following scripts:

- For **FuseDPO**:
  ```bash
  sh run_fusedpo.sh
  ```

- For **FuseSimPO**:
  ```bash
  sh run_fusesimpo.sh
  ```

- For **FuseRLOO**:
  ```bash
  sh run_fuserloo.sh
  ```
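
The two stages can also be chained from one shell session. The following is a minimal sketch, not part of the repository: `fusepo_script` and `run_pipeline` are hypothetical helper names, and it assumes that the current directory is `scripts` and that the Stage 2 scripts are already configured to use the Stage 1 output. This README does not state whether any paths need editing between stages.

```bash
# Sketch only: fusepo_script and run_pipeline are hypothetical helpers,
# not part of the repository.

# Map a short variant name to the Stage 2 script listed above.
fusepo_script() {
  case "$1" in
    dpo)   echo "run_fusedpo.sh" ;;
    simpo) echo "run_fusesimpo.sh" ;;
    rloo)  echo "run_fuserloo.sh" ;;
    *)     echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

# Run Stage 1, then the chosen Stage 2 script; stop at the first failure.
# Assumes the current directory is `scripts`.
run_pipeline() {
  stage2=$(fusepo_script "$1") || return 1
  sh train_fusesft.sh || return 1
  sh "$stage2"
}
```

For example, `run_pipeline simpo` would run `train_fusesft.sh` followed by `run_fusesimpo.sh`, and an unrecognized variant name fails before any training starts.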

---
