# BNPO
**BNPO: Beta Normalization Policy Optimization**

### Data Preparation
Prepare the data by running:

```bash
python recipe/bnpo/src/data_prepare.py
```

### Training
Train the model by running:

```bash
bash recipe/bnpo/run.sh
```

### Acknowledgement
This project is based on [verl](https://github.com/volcengine/verl).
