We are aware that the time at which we add the tabular data does not exactly match
the time at which the KB was constructed. Hence there could be a slight discrepancy.

However, the encyclopedic VQA experiments focus on the multimodal query, so the impact is negligible.
Hence, for the 

# Document processing
1. parquet_to_jpg.py
2. download_omitted_images.py
3. remove_nonexisting_images.py
4. add_tabular_modality.py
5. construct_debugging_doc.py  # This should be done after "Query processing - 1.preprocess.py"
6. construct_test_kb.py
7. construct_train_eval_kb.py

# Query processing
1. preprocess.py
2. mapping_function.py
3. delete_unused_data.py
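
The two lists above can be combined into a single run order. The following is a minimal sketch, not the repository's own driver. It assumes that every script sits in the current directory and runs without command-line arguments, and that the only cross-list constraint is the one noted above (`construct_debugging_doc.py` after `preprocess.py`). The helper names `build_run_order` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script names are copied from the two lists above.
DOCUMENT_STEPS = [
    "parquet_to_jpg.py",
    "download_omitted_images.py",
    "remove_nonexisting_images.py",
    "add_tabular_modality.py",
    "construct_debugging_doc.py",
    "construct_test_kb.py",
    "construct_train_eval_kb.py",
]
QUERY_STEPS = [
    "preprocess.py",
    "mapping_function.py",
    "delete_unused_data.py",
]


def build_run_order():
    """Return one ordering that runs preprocess.py before construct_debugging_doc.py.

    Assumption: no other dependency exists between the two lists, so
    preprocess.py is simply slotted in right before the step that needs it.
    """
    idx = DOCUMENT_STEPS.index("construct_debugging_doc.py")
    return (
        DOCUMENT_STEPS[:idx]
        + [QUERY_STEPS[0]]
        + DOCUMENT_STEPS[idx:]
        + QUERY_STEPS[1:]
    )


def run_pipeline(dry_run=True):
    """Print (dry run) or execute each script in order, stopping on the first failure."""
    for script in build_run_order():
        cmd = [sys.executable, script]
        if dry_run:
            print(" ".join(cmd))
        else:
            # Assumption: scripts take no arguments; adjust cmd if they do.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the sketch only prints the commands, which makes it easy to check the order before running anything.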


