# GLaVE-Cap

```
├── data (videos, keyframes, and masked keyframes go here)
├── output ([video_name].json; includes local captions, the video caption, and QAs)
├── config.yaml (set input/output paths here)
├── FrameInfo.py
├── main.py
├── Overview.py
├── ProcessFrame.py
├── Summary.py
├── Question.py
├── ProcessVideo.py
├── prompts.py
└── gpt_model.py (requires api_base and api_key)
```

