<h1 style="text-align: center;">PRIME: Process Reinforcement Through Implicit Rewards</h1>


### Building Foundation
PRIME is built on the [verl](https://github.com/volcengine/verl) framework, a flexible, efficient, and production-ready RL training library for large language models (LLMs). The PRIME implementation lives in `recipe/prime`.

### Installation
See the [verl installation guide](https://verl.readthedocs.io/en/latest/start/install.html) for instructions on installing the dependencies.

### Run
To start the training pipeline, run the following command:
```
bash recipe/prime/run_prime_qwen.sh
```
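
If you want to keep a log of the run, the command above can be wrapped in a small helper. This is only a sketch: `launch_prime` and the default log name `prime_train.log` are hypothetical names that are not part of verl, and it assumes the script is meant to be run from the root of a verl checkout (the relative path `recipe/prime/run_prime_qwen.sh` suggests this, but it is an assumption).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of verl): check that the PRIME launch script
# exists, then run it from the repository root and save its output to a log.
launch_prime() {
    local repo_root="${1:-.}"                       # assumed: path to a verl checkout
    local log_file="${2:-prime_train.log}"          # hypothetical log file name
    local script="recipe/prime/run_prime_qwen.sh"   # script path from this README

    if [ ! -f "${repo_root}/${script}" ]; then
        echo "error: ${script} not found under ${repo_root}" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    # pipefail makes the helper return the script's exit status, not tee's.
    # A relative log_file is written inside repo_root.
    (
        cd "${repo_root}" || exit 1
        set -o pipefail
        bash "${script}" 2>&1 | tee "${log_file}"
    )
}
```

After sourcing the file, `launch_prime /path/to/verl` would start training and write the combined stdout and stderr to `prime_train.log` inside that checkout.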