1. DETR
llava-llama-2-7b-chat-DETR-pretrain-1000

llava-llama-2-7b-chat-DETR-pretrain-1000-tune-1

2. DETR-v2
llava-llama-2-7b-chat-DETR-v2-pretrain-1
detr-v2-pretrain-1-solar-yogurt-183
llava-llama-2-7b-chat-DETR-v2-pretrain-1-tune-1

3. DETR as controllable path
a) set the vanilla model as the cfg branch
b) load clip model with mm_projector, and calculate neg_prompt as input
c) check trainable layers
d) tune
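
For step c), a minimal sketch of one way to check which layers are trainable before tuning. It assumes the model exposes PyTorch-style parameters (objects with `.requires_grad` and `.numel()`), so the result of `model.named_parameters()` can be passed in directly. The helper names `summarize_trainable` and `print_trainable` and the `depth` argument are hypothetical, not part of any existing training script.

```python
from collections import defaultdict


def summarize_trainable(named_params, depth=2):
    """Group parameters by name prefix and count trainable vs. total elements.

    `named_params` is any iterable of (name, param) pairs where `param`
    exposes `.requires_grad` and `.numel()`, e.g. the result of
    `model.named_parameters()` on a torch.nn.Module (assumption).
    `depth` is how many dot-separated name components form a group.
    """
    summary = defaultdict(lambda: {"trainable": 0, "total": 0})
    for name, param in named_params:
        prefix = ".".join(name.split(".")[:depth])
        count = param.numel()
        summary[prefix]["total"] += count
        if param.requires_grad:
            summary[prefix]["trainable"] += count
    return dict(summary)


def print_trainable(named_params, depth=2):
    """Print one line per group, marking groups that have trainable weights."""
    groups = summarize_trainable(named_params, depth)
    for prefix, counts in sorted(groups.items()):
        flag = "TRAIN " if counts["trainable"] else "frozen"
        print(f"{flag} {prefix}: {counts['trainable']:,} / {counts['total']:,}")
```

Calling `print_trainable(model.named_parameters())` right after loading the checkpoint should make it easy to confirm that only the intended modules (for example the mm_projector) are unfrozen before starting step d).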