script* files - Scripts for running the experiments described in the paper.

figures_and_results_processing.ipynb - Converts the results generated by the scripts to the figures seen in the paper.

*.py files - Various utilities used by the experiment scripts.

third_party - Contains our fork of the original TransformerLens (TL) repository, with implementations of the various VLMs we analyze in the paper. For each VLM's TL wrapper, we verify that it produces results identical to the HuggingFace implementation, to rule out implementation errors.
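
The equivalence check described above can be sketched roughly as follows. This is only an illustration, not the repository's actual verification code: the helper names `max_abs_diff` and `outputs_match`, the tolerance `atol`, and the sample values are all assumptions made for the example. In practice, the two inputs would be the flattened logits produced by the HuggingFace model and by the TL wrapper for the same prompt and image.

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat logit vectors."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a: Sequence[float], b: Sequence[float], atol: float = 1e-5) -> bool:
    """True when the two outputs agree within atol (hypothetical tolerance)."""
    return max_abs_diff(a, b) <= atol


# Stand-in values; real logits would come from the two model implementations.
hf_logits = [0.12, -1.50, 3.25]
tl_logits = [0.12, -1.50, 3.25]
print(outputs_match(hf_logits, tl_logits))
```

Setting `atol=0.0` demands exact equality; a small positive tolerance is a common choice when the two implementations may differ only by floating-point rounding.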

data - Contains the textual prompts (images are omitted due to space limitations). The prompts can also be regenerated with the code, but we include the full data so that reviewers do not have to run the code themselves.

data_generation - Scripts to generate, scrape, and label images for the analyzed tasks.