# Supplementary Material

This repository contains supplementary material for our study. It consists of two main directories:

---

## 1. `Raw_data/`
- Provides the **SubSDIC** dataset in CSV format.  
- SubSDIC is the only dataset among those used in our experiments that can be publicly released.  

---

## 2. `Experiment_result/`
Contains the complete experimental results, organized into four subdirectories:

### (a) `Distribution_similarity_test/`
- Includes four files that evaluate distributional similarity:
  - **KS tests**:  
    - `ks_test_continuous_variables_CPHS_vs_SynthCPHS.pdf`  
    - `ks_test_continuous_variables_SubSDIC_vs_SDIC.pdf`  
  - **JS divergence**:  
    - `js_divergence_plot_CPHS_vs_SynthCPHS.pdf`  
    - `js_divergence_plot_SubSDIC_vs_SDIC.pdf`  
- These files validate that SynthCPHS and SubSDIC replicate the distributions of their corresponding original datasets (CPHS and SDIC, respectively).
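
For readers unfamiliar with the two measures, the following is a minimal, standard-library sketch of a two-sample KS statistic and a base-2 JS divergence. It is not the code used to produce the PDFs above; the function names `ks_statistic` and `js_divergence` are illustrative, and the choice of log base 2 is an assumption. In practice, `scipy.stats.ks_2samp` also reports a p-value, and `scipy.spatial.distance.jensenshannon` returns the JS *distance* (the square root of the divergence).

```python
import bisect
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The empirical CDFs are step functions, so the largest gap
    # is always attained at one of the observed values.
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / n
        fb = bisect.bisect_right(b_sorted, x) / m
        d = max(d, abs(fa - fb))
    return d


def js_divergence(p, q):
    """JS divergence (log base 2) between two discrete probability vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


print(ks_statistic([1, 2, 3], [4, 5, 6]))  # 1.0 (no overlap)
print(js_divergence([1, 0], [0, 1]))       # 1.0 (disjoint support)
```

Both measures are 0 for identical inputs and reach 1 for completely separated ones, so smaller values indicate closer agreement between a dataset and its counterpart.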

### (b) `CPHS/`
### (c) `SynthCPHS/`
### (d) `SubSDIC/`
Each of these dataset folders contains all experimental results for the respective dataset:
- **`Downstream_performance/`**  
  - Results for classification (ACC, ROC) and, where applicable, regression (RMSE).  
- **`Imputation_performance/`**  
  - Contains two subfolders:  
    - `In_sample/`: Results on the training set.  
    - `Out_sample/`: Results on the test set.  
- **`Imputation_Time_Efficiency.csv`**  
  - Records the runtime of each imputation method on the dataset.
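
To check that a local copy matches the layout described above, a small script along these lines may help. It is a sketch under assumptions: `missing_entries` is a hypothetical helper, not part of the repository, and it assumes that the four PDFs sit directly inside `Distribution_similarity_test/` and that `Imputation_Time_Efficiency.csv` sits directly inside each dataset folder.

```python
from pathlib import Path

# Dataset folders listed under Experiment_result/ in this README.
DATASETS = ["CPHS", "SynthCPHS", "SubSDIC"]

SIMILARITY_FILES = [
    "ks_test_continuous_variables_CPHS_vs_SynthCPHS.pdf",
    "ks_test_continuous_variables_SubSDIC_vs_SDIC.pdf",
    "js_divergence_plot_CPHS_vs_SynthCPHS.pdf",
    "js_divergence_plot_SubSDIC_vs_SDIC.pdf",
]


def expected_paths():
    """Relative paths this README describes (assumed locations, see above)."""
    sim_dir = "Experiment_result/Distribution_similarity_test"
    paths = ["Raw_data", sim_dir]
    paths += [f"{sim_dir}/{name}" for name in SIMILARITY_FILES]
    for dataset in DATASETS:
        base = f"Experiment_result/{dataset}"
        paths += [
            f"{base}/Downstream_performance",
            f"{base}/Imputation_performance/In_sample",
            f"{base}/Imputation_performance/Out_sample",
            f"{base}/Imputation_Time_Efficiency.csv",
        ]
    return paths


def missing_entries(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected_paths() if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_entries("."):
        print("missing:", path)
```

Run from the repository root, the script prints nothing when every described file and folder is present.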

---

## Notes
- All experiments were run with the **default hyperparameters** provided in our GitHub repository.
