This archive contains the human behavior experiment data and the code used for modeling. After the official publication of the paper, anyone may use the human behavior experiment data independently for related research. See <guideline_for_Utilizing_Human_Behavior_Experiment_Data.doc> for details.

This archive consists of 7 subfolders, one environment configuration file (environment.yaml), and this readme.txt file. Model files (*.pth) are not included because of size limitations. Train the models yourself and place the resulting files in <Model_folder>.

To train the model, you need the <train_img_128.pkl> file, which contains the CheXpert images converted to numpy format. Download the CheXpert 1.0 compressed file from the CheXpert Competition page (https://stanfordmlgroup.github.io/competitions/chexpert/) and extract it into a subfolder of the 0.make_image_numpy_file folder.

Running <imgtonumpy.py> converts the images used in the experiment to numpy format and saves them as <train_img_128.pkl>. If your CheXpert files are stored elsewhere, change the path in the code. The <train_img_128.pkl> file must be located in the same folder as this readme.txt file.
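
Before starting a long training run, it can help to confirm that the generated file loads correctly. The sketch below is not part of the provided code. It assumes that <train_img_128.pkl> was written with Python's pickle module (suggested by the .pkl extension); the internal layout of the file is not documented here, so the helper <summarize_pickle> (a hypothetical name) only reports the top-level type and its shape or length.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and return (type name, shape or length or None)."""
    with open(path, "rb") as f:
        # pickle can execute code on load: only open files you created yourself.
        obj = pickle.load(f)
    if hasattr(obj, "shape"):        # e.g. a single numpy array
        size = tuple(obj.shape)
    elif hasattr(obj, "__len__"):    # e.g. a list or dict of arrays
        size = len(obj)
    else:
        size = None
    return type(obj).__name__, size
```

For example, print(summarize_pickle("train_img_128.pkl")) shows whether the file holds one array or a collection of arrays. Loading a file that contains numpy arrays requires numpy to be installed.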

The <1. subject_measurement_data> folder contains the raw data of the behavior experiments, divided into Group A and Group B, with one file per participant. See <guideline_for_Utilizing_Human_Behavior_Experiment_Data.doc> for how to interpret these files.

The <2. Similarity_pattern_vector> folder contains the similarity pattern vector (SPV) calculated from all subject data, which produces the results shown in Figure 3 of the paper. If the data folder location is correct, running <1.get_SPV.py> generates the graph without any further steps.

The <3. modeling_human_model_code> folder contains the training code for the main study, which trains a similarity embedding model on subject data. The random seeds used to select the final models are listed in <random_seed.csv>. The assignments to the training, validation, and evaluation sets for each experiment are stored in the random_triplet_assignment folder, and the code automatically picks up the assignment for each experiment iteration.

<main_coder_training_and_calculate_SP.py> is the training code. When you run it, you enter the group, subject ID, and random seed. Setting the experiment mode to 0 runs the standard experiment, while 1 or 2 runs one of the ablation settings.

After training the models for all subjects in a single group (subjects 1 to 62 or subjects 63 to 121), you can calculate the NSP. The model files must be located in <Model_folder>. Run <2.calculate_NSP.py> and enter the relevant subject number to see the results. For example, to calculate the NSP for participant 23, the code reads the evaluation triplets of subjects 1 to 62, excluding 23, and outputs the average accuracy of the inference.
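
The leave-one-out logic described above can be sketched as follows. This is an illustration only, not the code in <2.calculate_NSP.py>. The names <peers_of>, <cross_subject_accuracy>, <predict>, and <eval_triplets> are hypothetical, as is the data layout. Averaging the accuracy per peer and then across peers is also an assumption; the actual script may pool the triplets differently.

```python
from statistics import mean

# Subject ID ranges of the two groups, as stated in this readme.
GROUP_RANGES = (range(1, 63), range(63, 122))


def peers_of(subject_id):
    """Return the IDs of all other subjects in the same group."""
    for ids in GROUP_RANGES:
        if subject_id in ids:
            return [i for i in ids if i != subject_id]
    raise ValueError(f"subject ID {subject_id} is outside 1..121")


def cross_subject_accuracy(subject_id, predict, eval_triplets):
    """Average accuracy of one subject's model on the peers' evaluation triplets.

    predict: hypothetical callable mapping a triplet to a predicted choice.
    eval_triplets: hypothetical dict {subject ID: [(triplet, choice), ...]}.
    """
    per_peer = []
    for peer in peers_of(subject_id):
        records = eval_triplets.get(peer, [])
        if not records:
            continue  # skip peers without evaluation data
        hits = sum(predict(triplet) == choice for triplet, choice in records)
        per_peer.append(hits / len(records))
    if not per_peer:
        raise ValueError("no evaluation triplets found for any peer")
    return mean(per_peer)
```

For participant 23, peers_of(23) returns the 61 IDs from 1 to 62 without 23, which matches the example above.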

The <4.Annotation_study> folder contains the code for the qualitative analysis that produces Figure 5. It includes a separate code file for each of the four figures, from <figure_A1.py> to <figure_B2.py>, along with example model files, so the code can be run immediately.

The <5.modeling_simulation_model_code> folder contains the code files for the simulation experiment. <0.example_code_for_training_primary_model.py> trains a primary model that replaces human participants, using separate image files. Because training this model requires those separate images, we provide 16 pre-trained model files instead, so there is no need to run <0.example_code_for_training_primary_model.py> yourself.

Once the human replacement model files are in place, run <1.simulation_similarity_measurement.py> to query the similarity for unmeasured triplets using the human replacement model. Triplets are provided in meta***.csv files, where the number in the file name indicates how many triplets the file contains. The default setting is an experiment with 500 subsets, which uses the file with 1500 triplets. You need to specify a CSV file in which to store the virtual behavior experiment data produced by the human replacement model; the save file name is entered at the start of code execution.

Once the simulation experiment is complete, train the embedding model in the same manner as in the human cognitive modeling by running <2.training_embedding_model.py> and entering the required inputs.
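
The query step of the simulation can be pictured as a loop that reads each triplet, asks the replacement model for a response, and writes the result to the chosen CSV file. The sketch below illustrates only this flow and is not the code in <1.simulation_similarity_measurement.py>. The function <simulate_responses>, the callable <choose>, the "response" column name, and the assumption that the triplet file starts with a header row are all hypothetical.

```python
import csv


def simulate_responses(triplet_csv, output_csv, choose):
    """Answer every triplet in triplet_csv with choose() and save the results.

    choose: hypothetical stand-in for the human replacement model; it takes
    the values of one row as a list of strings and returns a response.
    Returns the number of triplets answered.
    """
    count = 0
    with open(triplet_csv, newline="") as src, \
            open(output_csv, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)  # assumes the first row is a header
        if header is None:
            return 0  # empty input file
        writer.writerow(header + ["response"])  # made-up column name
        for row in reader:
            writer.writerow(row + [choose(row)])
            count += 1
    return count
```

With the default setting, the returned count would be expected to equal the 1500 triplets in the input file.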