This directory contains the Physico-core set, a dataset manually created from 52 physical concepts.
Each physical concept contains several phenomena, and each phenomenon contains two instances. Therefore, there are 400 instances in total.

This dataset contains two types of files, meta_data{ID}.json and figure{ID}.jpg (or meta_data_{ID}.json and figure_{ID}.jpg), and each file corresponds to an instance.
A *.json file is the form of an instance used for text-based LLMs, and a *.jpg file is the form used for vision-based LLMs.
Each instance consists of three input-output pairs which represent a physical concept.
Note that each json file is paired with the jpg file that has the same ID.
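
The ID-based pairing described above could be sketched as follows. This is only an illustration, not part of the dataset tooling: the function name pair_instances is hypothetical, and it assumes that IDs are non-negative integers (as in figure0.jpg) and that both naming variants may appear in the same directory.

```python
import re
from pathlib import Path

# Assumption: IDs are non-negative integers, as in figure0.jpg / meta_data0.json.
# The optional underscore covers both naming variants mentioned in this README.
META_RE = re.compile(r"^meta_data_?(\d+)\.json$")
FIG_RE = re.compile(r"^figure_?(\d+)\.jpg$")


def pair_instances(directory):
    """Return {ID: (json_path, jpg_path)} for every ID that has both files."""
    metas, figs = {}, {}
    for path in Path(directory).iterdir():
        meta_match = META_RE.match(path.name)
        if meta_match:
            metas[int(meta_match.group(1))] = path
            continue
        fig_match = FIG_RE.match(path.name)
        if fig_match:
            figs[int(fig_match.group(1))] = path
    # Keep only IDs present in both formats, in ascending order.
    return {i: (metas[i], figs[i]) for i in sorted(metas.keys() & figs.keys())}
```

IDs that appear in only one format are dropped from the result, which makes it easy to spot incomplete downloads by comparing the number of pairs with the expected number of instances.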

For example, figure0.jpg and meta_data0.json correspond to the same instance in different formats.
figure0.jpg depicts three input-output subfigure pairs, leading to six subfigures in total. 
meta_data0.json includes the following fields:

"question": a list of input and output matrices, where the input and output matrices have the same shape. Each element ranges from 0 to 7, and each number denotes a particular color in the corresponding jpg file.
"gt": the ground-truth physical concept for the current instance.
"choices": a list of candidate concepts, including the ground-truth one, used for multiple-choice selection tasks.
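
A minimal sketch of reading one metadata file and checking it against the field descriptions above. The helper names (load_instance, cell_values, check_instance) are hypothetical. Because the exact nesting of "question" is not spelled out here, the sketch walks any nested list or dict instead of assuming a layout; it also assumes that matrix elements are stored as integers and that "gt" is stored in the same form as the entries of "choices".

```python
import json

REQUIRED_FIELDS = ("question", "gt", "choices")


def load_instance(json_path):
    """Load one meta_data file and make sure the three documented fields exist."""
    with open(json_path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return data


def cell_values(node):
    """Yield every integer found in an arbitrarily nested list/dict structure."""
    if isinstance(node, list):
        for child in node:
            yield from cell_values(child)
    elif isinstance(node, dict):
        for child in node.values():
            yield from cell_values(child)
    elif isinstance(node, int) and not isinstance(node, bool):
        yield node


def check_instance(data):
    """Return True if all matrix elements are in 0..7 and gt is one of the choices."""
    colors_ok = all(0 <= value <= 7 for value in cell_values(data["question"]))
    # Assumption: gt and the entries of choices are directly comparable.
    return colors_ok and data["gt"] in data["choices"]
```

For instance, check_instance(load_instance("meta_data0.json")) would be expected to return True for a well-formed file, provided the assumptions above hold.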

