## VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences

### VLLM Fine-tuning
We fine-tuned our VTON-VLLM based on [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory). The training configuration is located in train_vllm/pixtral12b_lora_sft.yaml.

### Metrics
Our evaluation metrics, Garment Consistency (GC) and Image Quality (IQ), can be computed using the scripts in metric/.

### Align VTON Models with Human Preferences
* train_flux_vton_refine.py is used to train the VTON Refinement Model with Fine-grained Supervision.
* train_flux_viton_syn.py is the training script for the generative model on the VITON-SYN dataset.
* tryon_inference.py is the inference script incorporating Test-Time Scaling with Human Preference-Aware Reward.
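
As a rough illustration of the test-time scaling idea, the sketch below uses best-of-N selection: sample several candidates with different seeds, score each with a reward function, and keep the highest-scoring one. This is an assumption about the general technique, not a description of what tryon_inference.py actually does. The names `best_of_n`, `generate`, and `reward` are hypothetical; in practice `generate` would wrap the try-on pipeline and `reward` would wrap the preference-aware scorer.

```python
from typing import Callable, List, Tuple, TypeVar
import random

T = TypeVar("T")


def best_of_n(
    generate: Callable[[int], T],
    reward: Callable[[T], float],
    n: int = 4,
    base_seed: int = 0,
) -> Tuple[T, float, List[float]]:
    """Sample n candidates and return the one with the highest reward.

    `generate` maps a seed to a candidate (hypothetical stand-in for a
    try-on pipeline call); `reward` maps a candidate to a scalar score
    (hypothetical stand-in for a preference-aware reward model).
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    scores: List[float] = []
    best_candidate = generate(base_seed)
    best_score = reward(best_candidate)
    scores.append(best_score)

    for i in range(1, n):
        candidate = generate(base_seed + i)
        score = reward(candidate)
        scores.append(score)
        # Keep the first candidate on ties so the result is deterministic.
        if score > best_score:
            best_candidate, best_score = candidate, score

    return best_candidate, best_score, scores


# Toy stand-ins: a "candidate" is a float, and the reward prefers values near 0.5.
def toy_generate(seed: int) -> float:
    return random.Random(seed).random()


def toy_reward(x: float) -> float:
    return -abs(x - 0.5)


best, best_score, all_scores = best_of_n(toy_generate, toy_reward, n=8)
```

Increasing `n` trades extra inference compute for a better expected reward, which is the basic trade-off behind this kind of test-time scaling.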

Our code is built on the [diffusers](https://github.com/huggingface/diffusers) codebase.