# Automatic Self-Enhancing Prompt Learning (AutoSEP)

Paper: Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs

Submission Number: 11126

## Important environment requirements
```
clip
google-genai==1.2.0
google-generativeai==0.8.4
google-api-core==2.24.1
openai==1.64.0
python-liquid==1.13.0
sglang
torch==2.5.1
torchvision==0.20.1
transformers==4.50.0
```
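To confirm that an existing environment matches the pinned versions above, a small check along these lines may help. This is a sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it assumes the pip distribution names are exactly the ones listed. The unpinned entries (`clip`, `sglang`) are left out.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the requirements list above.
# Unpinned entries (clip, sglang) are intentionally omitted.
PINNED = {
    "google-genai": "1.2.0",
    "google-generativeai": "0.8.4",
    "google-api-core": "2.24.1",
    "openai": "1.64.0",
    "python-liquid": "1.13.0",
    "torch": "2.5.1",
    "torchvision": "0.20.1",
    "transformers": "4.50.0",
}


def check_pins(pinned, get_version=version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be tested offline.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Some wheels report a local suffix such as "2.5.1+cu121";
        # compare only the public part of the version string.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Running the script prints one line per package that is missing or at a different version, and prints nothing when the environment matches.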

## AutoSEP
### AutoSEP optimization
```
cd autosep
python main.py --data_dir /dataset/location --model gemini --gradient_mode gemini --task_name CUB_cuckoo --n_train 30 --test_eval --rounds 6 --beam_size 4 --minibatch_size 50 --n_gradients 4 --mc_samples_per_step 1 --max_expansion_factor 5 --out_num 1
```

### Prompt evaluation
Instance-level classification of the prompts produced during AutoSEP optimization:

(Passing `--parallel` speeds up evaluation.)
```
cd autosep
python llm_text_compare.py --evaluate --generate --parallel --result_folder autosep --data_dir /dataset/location --model gemini --exp 1 --task_name CUB_cuckoo --mode train --n_test 30 --n_compare 10
```

Class-wise classification:
```
python classification.py --attributes --generate --parallel --result_folder autosep --data_dir /dataset/location --model gemini --exp 1 --task_name CUB_cuckoo --mode test --n_test 30 --prompt_idx 10 --out_num 1
```

## Baselines
### Optimization-free
#### Vanilla zero-shot
```
python multi_zero_shot.py --parallel --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --out_num 1
```

#### Zero-shot with descriptions
```
python multi_zero_shot.py --attributes --generate --parallel --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --out_num 1
```

#### Zero-shot with majority vote
```
cd baseline
python majority_vote.py --parallel --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --temperature 0.7 --n_votes 5 --out_num 1
```
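The `--temperature 0.7 --n_votes 5` flags suggest that this baseline samples several predictions per image and keeps the most frequent one. The sketch below illustrates that idea only; it is not the code in `majority_vote.py`, and `majority_vote`, `vote_with_sampler`, and `sample_fn` are hypothetical names, with `sample_fn` standing in for a single sampled model call.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    if not predictions:
        raise ValueError("need at least one prediction")
    # Counter.most_common keeps first-encountered order among equal counts.
    return Counter(predictions).most_common(1)[0][0]


def vote_with_sampler(sample_fn, n_votes=5):
    """Call a (hypothetical) sampler n_votes times and aggregate the labels.

    `sample_fn` is assumed to take no arguments and return one label,
    e.g. a wrapper around a model call at a nonzero temperature.
    """
    return majority_vote([sample_fn() for _ in range(n_votes)])


if __name__ == "__main__":
    # Stand-in sampler that replays a fixed sequence of labels.
    labels = iter(["black-billed", "yellow-billed", "black-billed",
                   "mangrove", "black-billed"])
    print(vote_with_sampler(lambda: next(labels), n_votes=5))
```

With an odd `n_votes` and two classes a tie cannot occur; with more classes, the first-seen tie-break used here is one arbitrary choice among several.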

#### Few-shot with random labels
```
cd baseline
python multi_random_label.py --random_labels --parallel --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --n_examples 5 --seed 1
```

#### Multiple images display
```
cd baseline
python multi_random_label.py --parallel --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --n_examples 5 --seed 1
```

#### K-means clustering
```
cd baseline
python cluster_img.py --data_dir /dataset/location --model gemini --task_name CUB_cuckoo --mode test --n_test 30 --device cuda:0 --n_clusters 7 --n_examples 3 --seed 1
```

### Optimization-based
#### Optimization with random labels
```
cd baseline
python main.py --model gemini --gradient_mode gemini --task_name CUB_cuckoo --data_dir /dataset/location --n_train 30 --test_eval --method random_label --rounds 6 --beam_size 4 --minibatch_size 50 --out_num 1
```

#### Optimization with majority vote
```
cd baseline
python main.py --model gemini --gradient_mode gemini --task_name CUB_cuckoo --data_dir /dataset/location --n_train 30 --test_eval --method majority_vote --temperature 0.7 --n_votes 5 --rounds 6 --beam_size 4 --minibatch_size 50 --out_num 2
```

#### SPO
```
cd spo
python main.py --model gemini --gradient_mode gemini --task_name CUB_cuckoo --data_dir /dataset/location --n_train 30 --test_eval --rounds 10 --beam_size 1 --minibatch_size 7 --n_gradients 1 --errors_per_gradient 3 --mc_samples_per_step 0 --max_expansion_factor 1 --out_num 1
```
