# Semi-Supervised Preference Learning via Risk Analysis

## Datasets and Models
### Dataset list:
[Animals](https://cvml.ista.ac.at/AwA2/)

[RealworldQA](https://huggingface.co/datasets/xai-org/RealworldQA)

[MMBench](https://huggingface.co/datasets/lmms-lab/MMBench)

[SeedBench](https://huggingface.co/datasets/lmms-lab/SEED-Bench)

[MMStar](https://huggingface.co/datasets/Lin-Chen/MMStar)

[ScienceQA](https://huggingface.co/datasets/lmms-lab/ScienceQA-IMG)

### Model list:
[Qwen2-VL-2B](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct)

[Llava-1.5-7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf)

[Tulu](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B)

[Idefics2-8B](https://huggingface.co/HuggingFaceM4/idefics2-8b)

[Idefics3-8B](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3)

[Florence2](https://huggingface.co/microsoft/Florence-2-large-ft)

## DPO
Our DPO code is based on [TRL](https://huggingface.co/docs/trl/v0.17.0/en/dpo_trainer#dpo-trainer). The details are implemented in ```./finetuning/finetune.py```.
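
For orientation, the standard DPO objective for a single preference pair can be sketched in a few lines of plain Python. This is an illustration of the published loss, not the code in ```./finetuning/finetune.py```; the function name `dpo_loss`, its argument names, and the `beta` value are assumptions made for this example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities.

    All names and the default beta are illustrative assumptions.
    """
    # Log-ratio of policy to frozen reference model for each response.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# When the policy equals the reference, the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# When the policy favors the chosen response more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In practice TRL computes this loss over batches of token-level log-probabilities; the scalar version above only shows the shape of the objective.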
## Process Data
First, convert every dataset to a JSONL file so that the risk rules can process it in the later steps. For task-specific customization, view and modify ```process_data.py```.
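
As a rough sketch of the JSONL convention (one JSON object per line), the snippet below writes and reads a small file. The field names `id`, `question`, and `answer` and the helper names are hypothetical; the actual schema is whatever ```process_data.py``` produces.

```python
import json
import tempfile
from pathlib import Path

def write_jsonl(records, path):
    """Write an iterable of dicts to a file, one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Hypothetical records; the real fields depend on the dataset and task.
examples = [
    {"id": 0, "question": "What animal is shown?", "answer": "zebra"},
    {"id": 1, "question": "What animal is shown?", "answer": "otter"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.jsonl"
    write_jsonl(examples, path)
    loaded = read_jsonl(path)
```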

## Generate activation vectors
Generate the activation matrix used to train the risk model by running ```generate_activate_vector.py```.

## Generate mu
After saving the activation matrix, generate the mean for each risk feature by running ```generate_mu.py```.
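
One plausible reading of this step, shown here only as a sketch, is that each risk feature gets the mean of a per-sample value taken over the samples on which that feature is active. The binary activation layout, the per-sample `values`, and the function name `feature_means` are all assumptions; consult ```generate_mu.py``` for the actual computation.

```python
def feature_means(activations, values):
    """Mean of `values` over the samples where each feature is active.

    activations: list of rows, one per sample, each a list of 0/1 flags
                 (assumed layout: samples x risk features).
    values:      one float per sample (assumed, e.g. a correctness indicator).
    Returns a list with one mean per feature, or None if it never fires.
    """
    if len(activations) != len(values):
        raise ValueError("activations and values must have the same length")
    n_features = len(activations[0]) if activations else 0
    sums = [0.0] * n_features
    counts = [0] * n_features
    for row, value in zip(activations, values):
        for j, active in enumerate(row):
            if active:
                sums[j] += value
                counts[j] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Three samples, three hypothetical risk features; the last never fires.
activations = [[1, 0, 0], [1, 1, 0], [0, 1, 0]]
values = [1.0, 0.0, 0.0]
mu = feature_means(activations, values)  # [0.5, 0.0, None]
```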

## Generate risk label
Run ```generate_risk_label.py```.

## Train Risk model
Run ```train_lrm.py``` to train the risk model and save its weights. We also release the weights of the trained risk model: it is the `.pth` file whose name starts with `lrm`.

## Mapping
This step corresponds to ```./mapping_activate_vector/mapping.py```.

## Generate preference dataset
Run ```./preference_dataset/generate_preference_dataset.py```.

## Evaluate
Evaluation has two parts: preference accuracy and the accuracy of the fine-tuned model. Preference accuracy for all models is computed in ```some_expriments_animals_preference_acc.py``` and ```some_experiment_qa_preference_acc.py```; the two scripts are separate because the classification and QA prompts differ. The risk model is evaluated separately in `riskmodel_preference_prediction.py`. The performance of the fine-tuned model is evaluated in `evaluate.py`.
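
A common way to define preference accuracy, sketched here under that assumption, is the fraction of pairs in which the model scores the preferred response strictly higher than the other one. The function name and the score inputs are hypothetical and may not match how the scripts above compute the metric.

```python
def preference_accuracy(chosen_scores, rejected_scores):
    """Fraction of pairs where the chosen response outscores the rejected one.

    Ties count as incorrect in this sketch (an assumption).
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return correct / len(chosen_scores)

# Four hypothetical pairs: two ranked correctly, one inverted, one tie.
acc = preference_accuracy([0.9, 0.2, 0.7, 0.5], [0.1, 0.8, 0.3, 0.5])  # 0.5
```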

## Preference data
We save the generated preference data as JSONL files in `./preference_json_data/`.
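
Before training, it can help to check that each record carries the fields the trainer expects. The sketch below assumes the `prompt` / `chosen` / `rejected` keys of TRL's explicit-prompt preference format; the files in this repository may use different or additional keys (for example, image fields), and the function name is hypothetical.

```python
import json

# Assumed field names; adjust to the actual schema of the JSONL files.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")

def parse_preference_lines(lines):
    """Parse JSONL lines into preference records, checking required keys."""
    pairs = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing keys {missing}")
        pairs.append(record)
    return pairs

# Hypothetical content of a preference file, given as a list of lines.
sample_lines = [
    '{"prompt": "What animal is this?", "chosen": "zebra", "rejected": "horse"}',
    "",
    '{"prompt": "How many legs?", "chosen": "four", "rejected": "six"}',
]
pairs = parse_preference_lines(sample_lines)
```

Because the function accepts any iterable of strings, an open file handle can be passed directly in place of `sample_lines`.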


