

## ✨ EmoPrefer

## 🚀 EmoPrefer-Data
**This is the first multimodal preference dataset centered on human emotions.** For each video, we provide two emotion descriptions and recruit multiple expert annotators to label which one they prefer. Only samples on which all annotators agree are retained, ensuring high-quality preference annotations. The raw video and audio come from MER2024, which is available at: https://huggingface.co/datasets/MERChallenge/MER2024
```text
dataset
├── video # download from MER2024
├── audio # download from MER2024
├── preference_threestrict.csv
├── preference_threestrict_reverse.csv
├── preference_threestrict_with_modelnames.csv
└── license.txt # license for this dataset
```
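
The unanimity rule described above can be illustrated with a minimal sketch. The data layout (a mapping from sample ID to a list of annotator labels) and the function name `keep_unanimous` are hypothetical and chosen only for illustration; this is not the pipeline used to build the released CSV files.

```python
from typing import Dict, List


def keep_unanimous(votes: Dict[str, List[str]]) -> Dict[str, str]:
    """Return {sample_id: label} for samples where every annotator agrees."""
    kept = {}
    for sample_id, labels in votes.items():
        # Skip samples without annotations; keep those with a single distinct label.
        if labels and len(set(labels)) == 1:
            kept[sample_id] = labels[0]
    return kept


# Hypothetical votes: each annotator prefers description "a" or "b".
example_votes = {
    "sample_001": ["a", "a", "a"],
    "sample_002": ["a", "b", "a"],
    "sample_003": ["b", "b", "b"],
}
print(keep_unanimous(example_votes))
```

In this toy input, `sample_002` is dropped because one annotator disagrees with the other two.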


## 🗝️ EmoPrefer-Bench
**This is the first benchmark for evaluating MLLMs' performance in emotion preference prediction.** We evaluate a range of multimodal large language models (MLLMs) and prompting techniques on the dataset above, and explore strategies for improving their alignment with human preferences.

### Requirements and Installation
```bash
conda env create -f environment.yml
```

### Main Code
The commands below use `qwen25omni_7b` as an example. Download the model weights and set `config.model2path['qwen25omni_7b']` to your local path.
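
As an illustration, the relevant entry might look like the excerpt below. This assumes `model2path` is a plain dictionary keyed by model name; the path is a placeholder, and the actual config file likely contains other entries that should be left untouched.

```python
# Hypothetical excerpt of the config module; only the key name comes from this README.
model2path = {
    # Placeholder path: replace with the directory holding your downloaded weights.
    "qwen25omni_7b": "/path/to/your/qwen25omni_7b/weights",
}
```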

```bash
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrict' --prompt='normal' --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrict' --prompt='cot'    --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrict' --prompt='cot2' --llm='qwen25'  --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrict' --prompt='cot3' --llm='qwen25'  --totalround=2

CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrictreverse' --prompt='normal' --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrictreverse' --prompt='cot'    --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrictreverse' --prompt='cot2' --llm='qwen25'  --totalround=2
CUDA_VISIBLE_DEVICES=0 python -u main_dpo_batch.py --model='qwen25omni_7b' --input_type='audiovideo' --output_type='preferencestrictreverse' --prompt='cot3' --llm='qwen25'  --totalround=2
```
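
The eight commands above cover every combination of two output types and four prompts. As a convenience, they could be generated with a short script like the sketch below. The helper name `build_commands` is hypothetical; the flags and values are copied from the commands above, including the detail that only `cot2` and `cot3` pass `--llm`.

```python
import itertools
import shlex

MODEL = "qwen25omni_7b"
OUTPUT_TYPES = ["preferencestrict", "preferencestrictreverse"]
PROMPTS = ["normal", "cot", "cot2", "cot3"]
# In the commands listed in this README, only these prompts pass --llm.
PROMPTS_WITH_LLM = {"cot2", "cot3"}


def build_commands(model=MODEL):
    """Build one argument list per (output_type, prompt) combination."""
    commands = []
    for output_type, prompt in itertools.product(OUTPUT_TYPES, PROMPTS):
        args = [
            "python", "-u", "main_dpo_batch.py",
            f"--model={model}",
            "--input_type=audiovideo",
            f"--output_type={output_type}",
            f"--prompt={prompt}",
        ]
        if prompt in PROMPTS_WITH_LLM:
            args.append("--llm=qwen25")
        args.append("--totalround=2")
        commands.append(args)
    return commands


# Print the commands instead of running them, so they can be reviewed first.
for args in build_commands():
    print("CUDA_VISIBLE_DEVICES=0 " + shlex.join(args))
```

To launch the runs instead of printing them, each argument list could be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment.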

### Intermediate Outputs
We provide intermediate results in the `./output-matching` folder.



## 🔒 License

This project is released under the Apache 2.0 license as found in the LICENSE file. The service is a research preview intended for **non-commercial use ONLY**. Please get in touch with us if you find any potential violations.
