## Quick Start

TL;DR: KGCS4Tab is implemented in `main.py`.

### 1. Prepare Python Environment

To obtain the datasets, you first need an environment that can access the TableShift benchmark. For details on TableShift, please visit [tableshift.org](https://tableshift.org/index.html).
```shell
conda env create -f environment.yml
conda activate tableshift
python examples/run_expt.py
```
The final line above prints detailed logging output as the script executes. When you see `training completed! test accuracy: xxxxx`, your environment is ready to go. (Accuracy may vary slightly due to randomness.)

### 2. Obtain the dataset
Run `cache_dataset.py` to obtain each dataset used in the experiments:
```shell
python cache_dataset.py --experiment xxx
```
Here, the `--experiment` argument can be one of **anes**, **brfss_blood_pressure**, **acsincome**, or **acspubcov**.
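
If you want all four datasets, the command above can be wrapped in a small loop. The following is only a sketch: the function name `cache_all` and the `DRY_RUN` variable are hypothetical helpers introduced here, not part of this repository, and the loop assumes `cache_dataset.py` is in the current directory.

```shell
# Hypothetical helper: cache every experiment listed above, one after another.
# Set DRY_RUN=1 to print the commands instead of executing them.
cache_all() {
    for exp in anes brfss_blood_pressure acsincome acspubcov; do
        # Progress message goes to stderr so stdout holds only command output.
        echo "Caching dataset: $exp" >&2
        # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
        ${DRY_RUN:+echo} python cache_dataset.py --experiment "$exp" || return 1
    done
}
```

Call `cache_all` after activating the `tableshift` environment, or `DRY_RUN=1 cache_all` to preview the commands first. The function stops at the first experiment that fails.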

### 3. Run our methods
KGCS4Tab is implemented in `main.py`.

Before running our method, set your DeepSeek API key in `main.py` at line 85.

Modify the `load_dataset` function at line 7 so that it points to your dataset path.

Then run our method with the following command:
```shell
python main.py --model xxx
```
For the `--model` option, choose one of `cat`, `light`, or `gbdt`, which select the CatBoost, LightGBM, and GBDT models, respectively.

Results are printed to your terminal.
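
Because results only appear in the terminal, you may want to keep a copy when comparing models. The sketch below runs each model option in turn and uses `tee` to save the output. The function name `run_all_models`, the `DRY_RUN` variable, and the `results_<model>.log` file names are assumptions made for illustration, not something the repository provides.

```shell
# Hypothetical helper: run main.py once per model option and keep a log of each run.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_all_models() {
    for model in cat light gbdt; do
        # tee shows the output in the terminal and writes it to a per-model log file.
        ${DRY_RUN:+echo} python main.py --model "$model" | tee "results_${model}.log"
    done
}
```

Call `run_all_models` from the directory containing `main.py`, after setting the API key and dataset path as described above.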

Due to supplementary material size limitations, we are unable to include intermediate results such as the generated rules and augmented datasets (which exceed 100GB). We plan to release all these resources publicly in the future to facilitate further research.

