### Environment Setup

```bash
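# Environment: Python 3.12, CUDA 12.4, torch 2.6.0
# After cloning, enter the repository and install
cd code_for_iclr
pip install -r requirements.txt
# Quote the extras so shells like zsh do not expand the brackets
pip install -e ".[gpu,test,math,vllm]"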
```
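
To check that the installed environment matches the versions listed above, a quick sanity check along these lines may help. This is a minimal sketch, not part of the repository: the `check_env` function name is hypothetical, and it assumes `python3` on your `PATH` is the interpreter you installed into.

```bash
# Hypothetical helper (not shipped with this repo): report Python, torch, and CUDA versions.
check_env() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3: not found"
        return 1
    fi
    # Expected per the setup notes: Python 3.12
    python3 --version
    # Expected per the setup notes: torch 2.6.0 built against CUDA 12.4
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch: not installed")
else:
    print("torch:", torch.__version__)
    print("cuda build:", torch.version.cuda)
    print("cuda available:", torch.cuda.is_available())
EOF
}

check_env
```

If `cuda available` prints `False`, the GPU driver or the CUDA build of torch is the likely place to look before starting training.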

### Training

```bash
bash scripts/train/run_grpo_qwen2.5.sh
```
Adjust the script to run with your custom settings.
