# LLaVA-Video

## Annotation Steps
- Sample frames at 1 fps
- Get a level-1 caption for each 10-second interval
- Get a level-2 caption for each 30-second interval
- Get the final caption for the whole video
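
The timing in the steps above can be sketched as a small schedule builder. This is a minimal illustration, not the pipeline's actual code: the function names (`sample_timestamps`, `intervals`, `annotation_schedule`) are hypothetical, and it assumes that a trailing partial interval is kept and clipped to the video duration, which the steps do not specify.

```python
from typing import Dict, List, Tuple


def sample_timestamps(duration_s: float, fps: float = 1.0) -> List[float]:
    """Timestamps (seconds) of frames sampled at `fps`, starting at t=0."""
    step = 1.0 / fps
    n_frames = int(duration_s * fps)
    return [i * step for i in range(n_frames)]


def intervals(duration_s: float, window_s: float) -> List[Tuple[float, float]]:
    """Consecutive (start, end) windows of `window_s` seconds.

    Assumption: the last, possibly shorter, window is kept and clipped
    to the video duration rather than dropped.
    """
    windows = []
    start = 0.0
    while start < duration_s:
        windows.append((start, min(start + window_s, duration_s)))
        start += window_s
    return windows


def annotation_schedule(duration_s: float) -> Dict[str, object]:
    """Frame timestamps and caption windows for one video."""
    return {
        "frames": sample_timestamps(duration_s, fps=1.0),  # 1 fps sampling
        "level1": intervals(duration_s, 10.0),             # 10-second captions
        "level2": intervals(duration_s, 30.0),             # 30-second captions
        "final": (0.0, duration_s),                        # whole-video caption
    }


schedule = annotation_schedule(65.0)
print(len(schedule["frames"]), len(schedule["level1"]), len(schedule["level2"]))
```

For a 65-second video this yields 65 sampled frames, seven level-1 windows, and three level-2 windows, with the last window of each level clipped to end at 65 seconds.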

## Modifications in reproduction
- ✅ Use a newer version of GPT-4o: 0806