## chair experiment
### black box
18 tests (3 cfg x 2 I x 3 models), each run twice (all, sub1000)
- cfg = 0, 0.3, 0.6
- I3, I4
- 13b, 7b, 7b-lora
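
The grid above can be enumerated with a short sketch, handy for checking that no run is missing. The dict keys (`subset`, `variant`, `model`, `cfg`) are hypothetical names chosen for illustration; only the values come from these notes.

```python
from itertools import product

# Values copied from the notes; the variable names are illustrative only.
CFG_SCALES = [0, 0.3, 0.6]
I_VARIANTS = ["I3", "I4"]
MODELS = ["13b", "7b", "7b-lora"]
SUBSETS = ["all", "sub1000"]


def enumerate_runs():
    """Return one dict per (subset, variant, model, cfg) combination."""
    return [
        {"subset": s, "variant": i, "model": m, "cfg": c}
        for s, i, m, c in product(SUBSETS, I_VARIANTS, MODELS, CFG_SCALES)
    ]


runs = enumerate_runs()
per_subset = [r for r in runs if r["subset"] == "all"]
print(len(per_subset), "tests per subset,", len(runs), "runs in total")
```
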

answer dir:
- all: /data/linxi/workspace/POPE/llava_qa/answer_all
- sub1000: /data/linxi/workspace/POPE/llava_qa/answer_sub1000

eval results:
- all: /data/linxi/workspace/POPE/llava_chair/all
- sub1000: /data/linxi/workspace/POPE/llava_chair/sub1000

#### observations
- 13b with cfg performs better than the original one (without cfg)
- 7b with cfg is not that good
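
For reference, a minimal sketch of how a cfg scale could act on next-token logits. This assumes cfg means classifier-free guidance in the form `cond + scale * (cond - uncond)`, where scale = 0 reproduces the original model; the notes do not state the formula, so the exact form used in these runs may differ. The function names and the toy logits are made up for illustration.

```python
def apply_cfg(cond_logits, uncond_logits, scale):
    """Assumed guidance rule: cond + scale * (cond - uncond).

    With scale = 0 the conditional logits come back unchanged.
    """
    if len(cond_logits) != len(uncond_logits):
        raise ValueError("logit vectors must have the same length")
    return [c + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Toy 3-token vocabulary: the conditioning only raises token 1.
cond = [2.0, 1.5, 0.1]
uncond = [2.0, 0.5, 0.1]

for scale in (0, 0.3, 0.6):
    guided = apply_cfg(cond, uncond, scale)
    print(scale, guided, "-> token", argmax(guided))
```

In this toy case the greedy token only flips at the largest scale, which is one reason to check the logits rather than just the final text.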

### white box
answer dir:
/data/linxi/workspace/POPE/llava_performance_cfg/answer

1. /data/linxi/workspace/POPE/llava_performance_cfg/answer_old
/data/linxi/workspace/POPE/llava_performance_cfg/answer_old/I3_sub240_control_7b_lightning-preview-DETR-v2-pretrain-1-tune-1_cfg0.0.jsonl
interesting! the detr branch describes more details
2. however, when swapping the positions of the two branches, the performance degraded.
This is because the detr branch is not well trained.
10 + random vs. random + random?
check the logits always
/data/linxi/workspace/POPE/llava_performance_cfg/answer_old/I3_sub240_control_7b_DETR-v2-pretrain-1-tune-1-lightning-preview_cfg0.0.jsonl
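
A rough way to compare the two branch orders is to load both answer files and compare answer length on the shared questions. This is only a sketch: it assumes each jsonl line has `question_id` and `text` fields (an assumption about the file layout, not verified here), and word count is a crude proxy for "more details", not a CHAIR score.

```python
import json
from pathlib import Path


def load_answers(path, id_key="question_id", text_key="text"):
    """Read a jsonl answer file into {id: answer text}.

    The key names are assumptions; adjust them to the real file layout.
    """
    answers = {}
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            answers[record[id_key]] = record[text_key]
    return answers


def mean_word_count(texts):
    """Average number of whitespace-separated words; 0.0 for no texts."""
    texts = list(texts)
    if not texts:
        return 0.0
    return sum(len(t.split()) for t in texts) / len(texts)


def compare_answers(path_a, path_b):
    """Compare answer lengths on the questions present in both files."""
    a, b = load_answers(path_a), load_answers(path_b)
    shared = sorted(set(a) & set(b))
    return {
        "shared": len(shared),
        "mean_words_a": mean_word_count(a[k] for k in shared),
        "mean_words_b": mean_word_count(b[k] for k in shared),
    }
```

Calling `compare_answers(path_a, path_b)` with the two `answer_old` files above returns the number of shared questions and the mean answer length for each ordering.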

bash ./llava/backbone/detr_branch/eval_cfg3.sh
bash ./llava/backbone/detr_branch/eval_cfg2.sh

## 1219
### 1. llava chair black box
I4_control
- 7b-lightning-preview: cfg = 0, 0.3, 0.6
- 13b-lightning-preview: cfg = 0, 0.3, 0.6

### 2. llava chair grey box

### 3. llava chair white box
