torch
accelerate
codetiming
datasets
flash-attn>=2.4.3
liger-kernel
mathruler
numpy
omegaconf
pandas
peft
pillow
pyarrow>=15.0.0
pylatexenc
qwen-vl-utils
ray
tensordict
torchdata
transformers>=4.49.0
vllm>=0.7.3
wandb
rouge
Levenshtein