First, run webvid_qa.py to generate the QA pair dataset.

Then run format.py to convert the generated dataset into the format used for training.
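
The two steps above can be chained in a small driver script. This is only a sketch: it assumes both scripts sit in the current working directory and need no command-line arguments, which the instructions here do not confirm. The names build_commands and run_pipeline are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

# Order matters: format.py consumes the dataset that webvid_qa.py generates.
PIPELINE = ["webvid_qa.py", "format.py"]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Return one argv list per script, in the order the steps must run.

    Assumption: neither script requires arguments. Add any flags your
    checkout actually needs to the inner lists.
    """
    return [[python, script] for script in scripts]


def run_pipeline(runner=subprocess.run):
    """Run each step in turn; check=True aborts at the first failing step."""
    for cmd in build_commands():
        runner(cmd, check=True)
```

Calling run_pipeline() from the directory that contains both scripts runs them back to back. Because check=True raises on a non-zero exit code, format.py is not started if the QA generation step fails.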